Binary Serialization
Serialization denotes the process of converting data into a format that can be written to disk or sent over a network connection.
Deserialization is the opposite process: reconstructing data from a file on disk or from bytes received over the network.
There are several widely-used textual serialization formats.
XML and JSON are important in web technologies.
Another popular format, used mainly for configuration files, is YAML; the Stack configuration file stack.yaml is one example.
CSV is often used for record-like data, for example to serialize spreadsheets or database tables.
Another important class of serialization formats consists of binary formats, which in Haskell are represented by (lazy) ByteStrings.
Such binary representations can be much more compact and efficient than textual, human-readable ones.
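As a rough illustration of the size difference, the following sketch compares the textual rendering of a 64-bit number with its encoding by the binary library discussed later in this section. The names big, textSize and binarySize are made up for this example.
code:size-comparison.hs
 import Data.Binary (encode)
 import qualified Data.ByteString.Lazy as L
 import Data.Word (Word64)
 -- Largest 64-bit unsigned value: 18446744073709551615 (20 decimal digits).
 big :: Word64
 big = maxBound
 -- Size of the human-readable rendering, one byte per ASCII digit.
 textSize :: Int
 textSize = length (show big)
 -- Size of the binary encoding; binary writes a Word64 as 8 big-endian bytes.
 binarySize :: Int
 binarySize = fromIntegral (L.length (encode big))
 main :: IO ()
 main = print (textSize, binarySize)  -- (20,8)
The gap depends on the data: a small number such as 7 takes one character as text but still 8 bytes as a Word64, so "more compact" is a tendency for large or structured data, not a guarantee.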
There are several Haskell libraries that support binary serialization, among them binary, store, and cereal.
We will have a closer look at binary in this lecture.
Binary package
code:binary-class.hs
 class Binary a where
   put :: a -> Put
   get :: Get a
 encode :: Binary a => a -> ByteString
 decode :: Binary a => ByteString -> a
 decodeOrFail :: Binary a => ByteString ->
   Either (ByteString, ByteOffset, String) (ByteString, ByteOffset, a)
 -- primitives and helpers
 putWord8 :: Word8 -> Put
 getWord8 :: Get Word8
 fail :: String -> Get a
 encodeFile :: Binary a => FilePath -> a -> IO ()
 decodeFileOrFail :: Binary a => FilePath -> IO (Either (ByteOffset, String) a)
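To see these functions in action, here is a minimal sketch of a round trip, first in memory and then through a file. It uses only types that already have Binary instances; the value sample, the helper roundTrip, and the file name sample.bin are assumptions made for this example.
code:roundtrip.hs
 import Data.Binary (decodeFileOrFail, decodeOrFail, encode, encodeFile)
 -- A value built only from types that already have Binary instances.
 sample :: [(Int, String)]
 sample = [(1, "one"), (2, "two")]
 -- In-memory round trip. On success, Right carries the unconsumed input,
 -- the number of bytes consumed, and the decoded value.
 roundTrip :: Either String [(Int, String)]
 roundTrip = case decodeOrFail (encode sample) of
   Left (_, _, err) -> Left err
   Right (_, _, v)  -> Right v
 main :: IO ()
 main = do
   print (roundTrip == Right sample)
   -- "sample.bin" is a hypothetical file name in the current directory.
   encodeFile "sample.bin" sample
   result <- decodeFileOrFail "sample.bin"
   case result of
     Left (offset, err) -> putStrLn ("decoding failed at byte " ++ show offset ++ ": " ++ err)
     Right v            -> print (v == sample)
Preferring decodeOrFail and decodeFileOrFail over decode means malformed input shows up as a Left value instead of a runtime error.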
Code example
code:example.hs
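 import Data.Binary
 import Data.Binary.Get (ByteOffset)
 import qualified Data.ByteString.Lazy as L
 data Tree a = Leaf a | Node (Tree a) (Tree a)
 -- Each constructor is written as a one-byte tag followed by its fields.
 instance Binary a => Binary (Tree a) where
   put (Leaf a)   = putWord8 0 >> put a
   put (Node l r) = putWord8 1 >> put l >> put r
   get = do
     tag <- getWord8
     case tag of
       0 -> Leaf <$> get
       1 -> Node <$> get <*> get
       _ -> fail "not a tree"
 type Result = Either (L.ByteString, ByteOffset, String) (L.ByteString, ByteOffset, Tree Char)
 test1, test2 :: Result
 -- Round trip of a well-formed tree: decoding succeeds (Right).
 test1 = decodeOrFail $ encode $ Node (Leaf 'x') (Leaf 'y')
 -- The encoding of 'x' does not start with a valid tag: decoding fails (Left).
 test2 = decodeOrFail $ encode 'x'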
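To make the outcome of test1 and test2 easier to follow, the sketch below prints the raw bytes behind both encodings. The type and instance are repeated from example.hs so that the file compiles on its own; the names treeBytes and charBytes are made up for this illustration.
code:tree-bytes.hs
 import Data.Binary
 import qualified Data.ByteString.Lazy as L
 -- Same type and instance as in example.hs.
 data Tree a = Leaf a | Node (Tree a) (Tree a)
 instance Binary a => Binary (Tree a) where
   put (Leaf a)   = putWord8 0 >> put a
   put (Node l r) = putWord8 1 >> put l >> put r
   get = do
     tag <- getWord8
     case tag of
       0 -> Leaf <$> get
       1 -> Node <$> get <*> get
       _ -> fail "not a tree"
 -- Raw bytes of the two encodings used in test1 and test2.
 treeBytes, charBytes :: [Word8]
 treeBytes = L.unpack (encode (Node (Leaf 'x') (Leaf 'y')))
 charBytes = L.unpack (encode 'x')
 main :: IO ()
 main = do
   print treeBytes  -- [1,0,120,0,121]: Node tag, Leaf tag, 'x', Leaf tag, 'y'
   print charBytes  -- [120]: read as a tag, this is neither 0 nor 1
The encoding carries no type information: the same bytes are interpreted according to whichever type the decoder asks for, so the tag check in get is the only thing that rejects the encoded Char in test2.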