Various lists in Haskell
List
There is a lot of syntactic sugar for lists in Haskell. Thus, lists are used for a lot of different purposes (e.g. String)
List are default data structures in functionnl languages much as arrays are in imperative languages
However, lists only support only very few operations efficiently.
code:operations.hs
[] :: a -- O(1)
(:) :: a -> a -> a -- O(1)
head :: a -> a -- O(1)
tail :: a -> a -- O(1)
snoc :: a -> a -> a -- O(n)
(!!) :: a -> Int -> a -- O(n)
(++) :: a -> a -> a -- O(m)(first list)
reverse :: a -> a -- O(n)
splitAt :: Int -> a -> (a,a) -- O(n)
union :: Eq a => a -> a -> a -- O(mn)
elem :: Eq a => a -> a -> Boole -- O(n)
List are suitable for use if:
Most operations we need are stack operations
Maximal size of the lists we deal with is relatively small
List are generally not suitable:
For random access
For set operations sucha as union and intersection
To deal with (really) large amount of text via String
Map
Implementation based upon binary search trees
Available in Data.Map
Key in the tree are ordered, so that efficient lookup is possible
Requires the keys to be in Ord
Inserting and removing elements can trigger rotations to rebalance the tree
Everything happens in a persistent setting
code:interface.hs
data Map k a -- abstract
empty :: Map k a -- O(1)
insert :: (Ord k) => k -> a -> Map k a -> Map k a -- O(log n)
lookup :: (Ord k) => k -> Map k a -> Maybe a -- O(log n)
delete :: (Ord k) => k -> Map k a -> Map k a -- O(log n)
update :: (Ord k) => (a -> Maybe a) -> k -> Map k a -> Map k a -- O(log n)
union :: (Ord k) => Map k a -> Map k a -> Map k a -- O(m + n)
member :: (Ord k) => k -> Map k a -> Bool -- O(log n)
size :: Map k a -> Int -- O(1)
map :: (a -> b) -> Map k a -> Map k b -- O(n)
Sets
Sets are special case of finite maps: essentially,
type Set a = Map a ()
A specialized set implementation is available in Data.Set.
Sequences
Sometimes, we need a data structure with
Efficient random access to arbitrary elements
Very efficient access to both ends
Efficient concatenation and splitting
This is offered by the Data.Sequence library from containers package.
Sequences are implemented as an special form of trees called finger trees
code:interface.hs
data Seq a -- abstract
empty :: Seq a -- O(1)
(<|) :: a -> Seq a -> Seq a -- O(1)
(|>) :: Seq a -> a -> Seq a -- O(1)
(><) :: Seq a -> Seq a -> Seq a -- O(log(min(m, n)))
null :: Seq a -> Bool -- O(1)
length :: Seq a -> Int -- O(1)
filter :: (a -> Bool) -> Seq a -> Seq a -- O(n)
fmap :: (a -> b) -> Seq a -> Seq b -- O(n)
index :: Seq a -> Int -> a -- O(log(min(i, n - 1)))
splitAt :: Seq a -> Int -> (Seq a, Seq a) -- O(log(min(i, n- i)))
Vector
Data.Vector is offered from vector library.
vector offers efficient slicing
It additionally stores a lower and upper bound. These are used to implement the slicing operations.
code:interface.hs
data Vector a -- abstract
empty :: Vector a -- O(1)
generate :: Int -> (Int -> a) -> Vector a -- O(n)
fromList :: a -> Vector a -- o(n)
(++) :: Vector a -> Vector a -> Vector a -- O(m + n)
(!) :: Vector a -> Int -> a -- O(1)
(!?) :: Vector a -> Int -> Maybe a -- O(1)
slice :: Int -> Int -> Vector a -> Vector -- O(1)
(//) :: Vector a -> (Int, a) -> Vector a -- O(m + n)
map :: (a -> b) -> Vector a -> Vector b -- O(n)
filter :: (a -> Bool) -> Vector a -> Vector a -- O(n)
foldr :: (a -> b -> b) -> b -> Vector a -> b -- O(n)
(Important) Advice on persistent arrays
Stay away if you require a large number of small incremental updates. Finite maps or sequences are much more suited
Arrays can be useful if you have an essentially constant table that you need to access frequently.
Array can also be useful if you perform global updates on them anyway.
Unboxed Vector
Data.Vector.Unboxed is similar to Data.Vector
code:interface.hs
data Vector a -- abstract
empty :: Unbox a => Vector a -- O(1)
generate :: Unbox a => Int -> (Int -> a) -> Vector a -- O(n)
fromList :: Unbox a => a -> Vector a -- O(n)
(++) :: Unbox a => Vector a -> Vector a -> Vector a -- O(m + n)
(!) :: Unbox a => Vector a -> Int -> a -- O(1)
(!?) :: Unbox a => Vector a -> Int -> Maybe a -- O(1)
slice :: Unbox a => Int -> Int -> Vector a -> Vector -- O(1)
(//) :: Unbox a => Vector a -> (Int, a) -> Vector a -- O(m + n)
map :: (Unbox a, Unbox b) => (a -> b) -> Vector a -> Vector b -- O(n)
filter :: Unbox a => (a -> Bool) -> Vector a -> Vector a -- O(n)
foldr :: Unbox a => (a -> b -> b) -> b -> Vector a -> b -- O(n)
Mutable vectors
Some algorithms can be implemented more efficiently in the presence of destructive updates
Haskell has mutable data structures as well as immutable ones
Most operations on mutable data structures have IO type. (e.g IORef, TVar, MVar)
Both Data.Vector.Mutable and Data.Vector.Unboxed.Mutable exports a datatype
data IOVector (a :: *) abstract
code:interface.hs
new :: Int -> IO (IOVector a)
replicate :: Int -> a -> IO (IOVector a)
read :: IOVector a -> Int -> IO a
write :: IOVector a -> Int -> a -> IO ()
https://gyazo.com/8eee7138bb34d79e3ea0bd36e4c72e94
#Strings
#Merkle_tree
#Binary_Serialization