Various lists in Haskell

List

There is a lot of syntactic sugar for lists in Haskell. Thus, lists are used for a lot of different purposes (e.g. String)

List are default data structures in functionnl languages much as arrays are in imperative languages

However, lists only support only very few operations efficiently.

code:operations.hs

[] :: a -- O(1)

(:) :: a -> a -> a -- O(1)

head :: a -> a -- O(1)

tail :: a -> a -- O(1)

snoc :: a -> a -> a -- O(n)

(!!) :: a -> Int -> a -- O(n)

(++) :: a -> a -> a -- O(m)(first list)

reverse :: a -> a -- O(n)

splitAt :: Int -> a -> (a,a) -- O(n)

union :: Eq a => a -> a -> a -- O(mn)

elem :: Eq a => a -> a -> Boole -- O(n)

List are suitable for use if:

Most operations we need are stack operations

Maximal size of the lists we deal with is relatively small

List are generally not suitable:

For random access

For set operations sucha as union and intersection

To deal with (really) large amount of text via String

Map

Implementation based upon binary search trees

Available in Data.Map

Key in the tree are ordered, so that efficient lookup is possible

Requires the keys to be in Ord

Inserting and removing elements can trigger rotations to rebalance the tree

Everything happens in a persistent setting

code:interface.hs

data Map k a -- abstract

empty :: Map k a -- O(1)

insert :: (Ord k) => k -> a -> Map k a -> Map k a -- O(log n)

lookup :: (Ord k) => k -> Map k a -> Maybe a -- O(log n)

delete :: (Ord k) => k -> Map k a -> Map k a -- O(log n)

update :: (Ord k) => (a -> Maybe a) -> k -> Map k a -> Map k a -- O(log n)

union :: (Ord k) => Map k a -> Map k a -> Map k a -- O(m + n)

member :: (Ord k) => k -> Map k a -> Bool -- O(log n)

size :: Map k a -> Int -- O(1)

map :: (a -> b) -> Map k a -> Map k b -- O(n)

Sets

Sets are special case of finite maps: essentially,

type Set a = Map a ()

A specialized set implementation is available in Data.Set.

Sequences

Sometimes, we need a data structure with

Efficient random access to arbitrary elements

Very efficient access to both ends

Efficient concatenation and splitting

This is offered by the Data.Sequence library from containers package.

Sequences are implemented as an special form of trees called finger trees

code:interface.hs

data Seq a -- abstract

empty :: Seq a -- O(1)

(<|) :: a -> Seq a -> Seq a -- O(1)

(|>) :: Seq a -> a -> Seq a -- O(1)

(><) :: Seq a -> Seq a -> Seq a -- O(log(min(m, n)))

null :: Seq a -> Bool -- O(1)

length :: Seq a -> Int -- O(1)

filter :: (a -> Bool) -> Seq a -> Seq a -- O(n)

fmap :: (a -> b) -> Seq a -> Seq b -- O(n)

index :: Seq a -> Int -> a -- O(log(min(i, n - 1)))

splitAt :: Seq a -> Int -> (Seq a, Seq a) -- O(log(min(i, n- i)))

Vector

Data.Vector is offered from vector library.

vector offers efficient slicing

It additionally stores a lower and upper bound. These are used to implement the slicing operations.

code:interface.hs

data Vector a -- abstract

empty :: Vector a -- O(1)

generate :: Int -> (Int -> a) -> Vector a -- O(n)

fromList :: a -> Vector a -- o(n)

(++) :: Vector a -> Vector a -> Vector a -- O(m + n)

(!) :: Vector a -> Int -> a -- O(1)

(!?) :: Vector a -> Int -> Maybe a -- O(1)

slice :: Int -> Int -> Vector a -> Vector -- O(1)

(//) :: Vector a -> (Int, a) -> Vector a -- O(m + n)

map :: (a -> b) -> Vector a -> Vector b -- O(n)

filter :: (a -> Bool) -> Vector a -> Vector a -- O(n)

foldr :: (a -> b -> b) -> b -> Vector a -> b -- O(n)

(Important) Advice on persistent arrays

Stay away if you require a large number of small incremental updates. Finite maps or sequences are much more suited

Arrays can be useful if you have an essentially constant table that you need to access frequently.

Array can also be useful if you perform global updates on them anyway.

Unboxed Vector

Data.Vector.Unboxed is similar to Data.Vector

code:interface.hs

data Vector a -- abstract

empty :: Unbox a => Vector a -- O(1)

generate :: Unbox a => Int -> (Int -> a) -> Vector a -- O(n)

fromList :: Unbox a => a -> Vector a -- O(n)

(++) :: Unbox a => Vector a -> Vector a -> Vector a -- O(m + n)

(!) :: Unbox a => Vector a -> Int -> a -- O(1)

(!?) :: Unbox a => Vector a -> Int -> Maybe a -- O(1)

slice :: Unbox a => Int -> Int -> Vector a -> Vector -- O(1)

(//) :: Unbox a => Vector a -> (Int, a) -> Vector a -- O(m + n)

map :: (Unbox a, Unbox b) => (a -> b) -> Vector a -> Vector b -- O(n)

filter :: Unbox a => (a -> Bool) -> Vector a -> Vector a -- O(n)

foldr :: Unbox a => (a -> b -> b) -> b -> Vector a -> b -- O(n)

Mutable vectors

Some algorithms can be implemented more efficiently in the presence of destructive updates

Haskell has mutable data structures as well as immutable ones

Most operations on mutable data structures have IO type. (e.g IORef, TVar, MVar)

Both Data.Vector.Mutable and Data.Vector.Unboxed.Mutable exports a datatype

data IOVector (a :: *) abstract

code:interface.hs

new :: Int -> IO (IOVector a)

replicate :: Int -> a -> IO (IOVector a)

read :: IOVector a -> Int -> IO a

write :: IOVector a -> Int -> a -> IO ()

https://gyazo.com/8eee7138bb34d79e3ea0bd36e4c72e94

#Strings

#Merkle_tree

#Binary_Serialization