Haskell: write yourself a Monad in almost no time (2)

If you came here without having read part1 you may wish to do so now.

How can we use our parser combinator function chain to construct more sophisticated parsers?

Well, let’s just try to build a simple parser chain:
chain1 = chain item (\a -> item)

While this will indeed parse the original input sequentially (you can easily verify with parse chain1 “123”), it will throw away the first result, namely a.

Something more must be done: we must explicitly combine the results of both sub-parses. Therefore we need explicit access to the result of the second parse, too. Thus:

chain2 = chain item (\a ->
   chain item (\b ->
      inject [a,b]))

[The complete code so far can be downloaded here.]

“But where are the Monads?” you may ask. Well, let’s just turn our whole parser into one. Very little code is needed. In fact just:

instance Monad Parser where
   return = inject
   (>>=) = chain

So, take a deep breath and read on

Advertisements
Posted in Uncategorized | 2 Comments

Haskell: write yourself a Monad in almost no time (3)

If you came here without having read any of the previous posts you may wish to do so here and here.

We just turned our parser into a Monad by adding

instance Monad Parser where
   return = inject
   (>>=) = chain

To see the benefit this gives us let’s return to our parser chain.
We can now express it like this:

mchain1 = do
   a <- item
   b <- item
   return [a,b]

Previously we had written

chain2 = chain item (\a ->
   chain item (\b ->
      inject [a,b]))

We still use the same variables and the same result, but the chaining itself actually got simpler. And the longer the chain, the more we gain with the Monad’s machinery.
And all this without any magic. By saying

instance Monad Parser where
   return = inject
   (>>=) = chain

we turned our Parser into a Monad. All previous code will still work, but in addition to that we can now use the syntactic sugar that Monads provide (do-notation).

Before we analyze that sugar in more detail let’s first have a look at the Monad instantiation of Parser. A Monad instance must provide at least two functions, return and >>=. Since our parser already came equipped with everything necessary, we just had to tell GHC (or whatever you’re using), that it should use the appropriate methods: chain as our implementation of >>= and inject as implementation of return.

While inject is rather straightforward (it just returns its argument as the result of the parse and doesn’t touch the input), chain may eventually require a little more thought to understand. The most interesting part is actually the last line:
(Just a, x) -> parse (f a) x

In case the first parse succeeded we have its result in a and the remaining input in x.
Executing f on the result gives us the parser we then use to parse the remaining input.

So, without using the Monad’s machinery we essentially had to manually build a function f that would give us a parser doing the second parse and combining it with the result of the first parse.
When using the Monad machinery we just make two parses (they’re chained implicitly by the Monad) and return the combined result. No manual function generation!

Note that the values are sort of “extracted” from the parser monad: our chain function determines the value to be “extracted”. The ParseState remains hidden behind the scenes. The return then wraps our values again.
Note, too, that pattern matching can extract the value in very much the same way.
Instead of
parse (do a <- item; "-> use value of 'a' <-") input
you can write
case parse item input of (Just a, _) -> "-> use value of 'a' <-"
In both cases will the parse value be bound to a.

Let’s now make the Monad’s syntactic sugar more explicit:

mchain1 = do
   a <- item
   b <- item
   return [a,b]

is the same as

mchain2 = do
   item >>= (\ a ->
      item >>= (\b ->
         return [a,b]))

which is the same as (substituting our own function names):

mchain3 = do
   item `chain` (\a ->
      item `chain` (\b ->
         inject [a,b]))

which is identical to

mchain4 = do
   chain item (\a ->
      chain item (\b ->
         inject [a,b]))

Compare this to

chain2 =
   chain item (\a ->
      chain item (\b ->
         inject [a,b]))

🙂

[The code for this part can be downloaded here.]

Posted in Uncategorized | 1 Comment

Haskell: write yourself a Monad in almost no time

Having read a couple of messages about the difficulty to learn about Monads on the Haskell Beginners mailing list, I thought I’d give it a try and explain my experience. It might help someone with his/her learning experience.

A word of caution: being a beginner in Haskell myself I do not claim to have the widest of horizons talking about Monads, in fact it’s still rather limited. But I do have a very fresh memory of where the difficulties were for me. Might be a plus. But well, let’s just get started…

Heavily inspired by “Graham Hutton: Programming in Haskell” and “Bryan O’Sullivan, John Goerzen & Don Stewart: Real World Haskell” this will be about a Parser.

The type definitions:
type ParseState = String
newtype Parser a = P (ParseState -> (Maybe a, ParseState))

That is: a ParseState is just a String (the remaining input to parse). A Parser is a function from a ParseState to a pair of Maybe a (the result) and another ParseState (the input left to parse).

It is now easy to write a function parse that drives parsing:

parse :: Parser a -> ParseState -> (Maybe a, ParseState)
parse (P p) ps = p ps

parse takes a Parser and a ParseState and returns a pair of Maybe a (the result) and another ParseState (the input left to parse). Exactly as we wanted – see above.

Ok, now let’s write three simple parsers. One to handle failures, one to yield a constant result and one that simply returns the first character of the parse state as result.

-- fail a little more gracefully then just calling 'error'
failure s = P (\inp -> (Nothing, ""))

-- return parameter as parse result without touching the parser's input
inject :: a -> Parser a
inject s = P (\inp ->(Just s, inp))

-- just return first item of parse input
item :: Parser Char
item = P (\inp -> case inp of
   [] -> (Nothing, [])
   (s:ss) -> (Just s, ss))

So far we can do simple parsing, like for examle for the first item:
parse item "123"

In order to build more interesting parsers from these simple building blocks, however, we need the possibility to combine parsers. Two useful combinations are sequential composition (chaining parsers) and alternative parsing (use either parser1 or – if that fails – parser2).
Let’s write two functions that allow just that:

-- combine two parsers into a sequential one
-- Note that the second argument is a function, *not* a parser, that is
-- supposed to return a parser which can/should consider it's argument
-- (the result of the first parse).
chain :: Parser a -> (a -> Parser b) -> Parser b
chain pa f = P resultParser
where
resultParser input = case parse pa input of
   (Nothing, x) -> (Nothing, input)
   (Just a, x) -> parse (f a) x


-- choice: parse input with 'p1' and return result or, if the parse failed then
-- parse with 'p2' and return result no matter what
por :: Parser a -> Parser a -> Parser a
por p1 p2 = P (\inp -> case parse p1 inp of
   (Nothing, x) -> parse p2 inp
   x -> x)

While this gives us already everything we need to build rather sophisticated parsers, we still need to understand how to best make use of these building blocks.

So let’s continue here…

Posted in Uncategorized | 2 Comments