Haskell/Alternative and MonadPlus
In our studies so far, we saw that both Maybe
and lists can represent computations with a varying number of results. We use Maybe
to indicate a computation can fail somehow (that is, it can have either zero results or one result), and we use lists for computations that can have many possible results (ranging from zero to arbitrarily many results). In both of these cases, one useful operation is amalgamating all possible results from multiple computations into a single computation. With lists, for instance, that would amount to concatenating lists of possible results. The Alternative
class captures this amalgamation in a general way.
Definition
editNote
The Alternative
class and its methods can be found in the Control.Applicative module.
Alternative
is a subclass of Applicative
whose instances must define, at a minimum, the following two methods:
class Applicative f => Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
empty
is an applicative computation with zero results, while (<|>)
is a binary function which combines two computations.
Here are the two instance definitions for Maybe
and lists:
instance Alternative Maybe where
empty = Nothing
-- Note that this could have been written more compactly.
Nothing <|> Nothing = Nothing -- 0 results + 0 results = 0 results
Just x <|> Nothing = Just x -- 1 result + 0 results = 1 result
Nothing <|> Just x = Just x -- 0 results + 1 result = 1 result
Just x <|> Just y = Just x -- 1 result + 1 result = 1 result:
-- Maybe can only hold up to one result,
-- so we discard the second one.
instance Alternative [] where
empty = []
(<|>) = (++) -- length xs + length ys = length (xs ++ ys)
Example: parallel parsing
editTraditional input parsing involves functions which consume an input one character at a time. That is, a parsing function takes an input string and chops off (i.e. "consumes") characters from the front if they satisfy certain criteria. For example, you could write a function which consumes one uppercase character. If the characters on the front of the string don't satisfy the given criteria, the parser has failed. In the example below, for instance, we consume a digit in the input and return the digit that was parsed. The possibility of failure is expressed by using Maybe
.
digit i (c:_)
| i > 9 || i < 0 = Nothing
| otherwise =
if [c] == show i then Just i else Nothing
The guards assure that the Int
we are checking for is a single digit. Otherwise, we are just checking that the first character of our String matches the digit we are checking for. If it passes, we return the digit wrapped in a Just
. Otherwise we return Nothing
.
Now, (<|>)
can be used to run two parsers in parallel. That is, we use the result of the first one if it succeeds, and otherwise, we use the result of the second. If both fail, then the combined parser returns Nothing
. We can use digit
with (<|>)
to, for instance, parse strings of binary digits:
binChar :: String -> Maybe Int
binChar s = digit 0 s <|> digit 1 s
Parser libraries often make use of Alternative
in this way. Two examples are (+++)
in Text.ParserCombinators.ReadP and (<|>)
in Text.ParserCombinators.Parsec.Prim. This usage pattern can be described in terms of choice. For instance, if we want to give binChar
a string that will be successfully parsed, we have two choices: either to begin the string with '0'
or with '1'
.
MonadPlus
editThe MonadPlus
class is closely related to Alternative
:
class Monad m => MonadPlus m where
mzero :: m a
mplus :: m a -> m a -> m a
Its definition is the same as Alternative
, except for different method names and the Applicative
constraint being changed into Monad
. Unsurprisingly, for types that have instances of both Alternative
and MonadPlus
, mzero
and mplus
should be equivalent to empty
and (<|>)
respectively.
One might legitimately wonder why the seemingly redundant MonadPlus
class exists. Part of the reason is historical: just like Monad
existed in Haskell long before Applicative
was introduced, MonadPlus
is much older than Alternative
. Beyond such accidents, there are additional expectations (ones that do not apply to Alternative
) about how the MonadPlus
methods should interact with the Monad
, and therefore indicating that something is a MonadPlus
is a stronger claim than indicating that it is both an Alternative
and a Monad
. We will make some additional considerations about this issue in the following section.
Alternative and MonadPlus laws
editLike most general-purpose classes, Alternative
and MonadPlus
are expected to follow a handful of laws. However, there isn't universal agreement on what the full set of laws should look like. The most commonly adopted laws, and the most crucial for providing intuition about Alternative
, say that empty
and (<|>)
form a monoid. By that, we mean:
-- empty is a neutral element
empty <|> u = u
u <|> empty = u
-- (<|>) is associative
u <|> (v <|> w) = (u <|> v) <|> w
There is nothing fancy about "forming a monoid": in the above, "neutral element" and "associative" here is just like how addition of integer numbers is said to be associative and to have zero as its neutral element. In fact, this analogy is the source of the names of the MonadPlus
methods, mzero
and mplus
.
As for MonadPlus
, at a minimum there usually are the monoid laws, which correspond exactly to the ones just above...
mzero `mplus` m = m
m `mplus` mzero = m
m `mplus` (n `mplus` o) = (m `mplus` n) `mplus` o
... plus the additional two laws, quoted by the Control.Monad documentation:
mzero >>= f = mzero -- left zero
m >> mzero = mzero -- right zero
If mzero
is interpreted as a failed computation, these laws state that a failure within a chain of monadic computations leads to the failure of the whole chain.
We will touch upon some additional suggestions of laws for Alternative
and MonadPlus
at the end of the chapter.
Useful functions
editIn addition to (<|>)
and empty
, there are two other general-purpose functions in the base libraries involving Alternative
.
asum
editA common task when working with Alternative
is taking a list of alternative values, e.g. [Maybe a]
or [[a]]
, and folding it down with (<|>)
. The function asum
, from Data.Foldable
fulfills this role:
asum :: (Alternative f, Foldable t) => t (f a) -> f a
asum = foldr (<|>) empty
In a sense, asum
generalizes the list-specific concat
operation. Indeed, the two are equivalent when lists are the Alternative
being used. For Maybe, asum
finds the first Just x
in the list and returns Nothing
if there aren't any.
It should also be mentioned that msum
, available from both `Data.Foldable` and `Control.Monad`, is just asum
specialised to MonadPlus
.
msum :: (MonadPlus m, Foldable t) => t (m a) -> m a
guard
editWhen discussing the list monad we noted how similar it was to list comprehensions, but we didn't discuss how to mirror list comprehension filtering. The guard
function from Control.Monad
allows us to do exactly that.
Consider the following comprehension which retrieves all pythagorean triples (i.e. trios of integer numbers which work as the lengths of the sides for a right triangle). First we'll examine the brute-force approach. We'll use a boolean condition for filtering; namely, Pythagoras' theorem:
pythags = [ (x, y, z) | z <- [1..], x <- [1..z], y <- [x..z], x^2 + y^2 == z^2 ]
The translation of the comprehension above to a list monad do-block is:
pythags = do
z <- [1..]
x <- [1..z]
y <- [x..z]
guard (x^2 + y^2 == z^2)
return (x, y, z)
The guard
function can be defined for all Alternative
s like this:
guard :: Alternative m => Bool -> m ()
guard True = pure ()
guard _ = empty
guard
will reduce a do-block to empty
if its predicate is False
. Given the left zero law...
mzero >>= f = mzero
-- Or, equivalently:
empty >>= f = empty
... an empty
on the left-hand side of an >>=
operation will produce empty
again. As do-blocks are decomposed to lots of expressions joined up by (>>=)
, an empty
at any point will cause the entire do-block to become empty
.
Let's examine in detail what guard
does in the pythags
. First, here is guard
defined for the list monad:
-- guard :: Bool -> [()]
guard True = [()]
guard _ = []
Basically, guard
blocks off a route. In pythags
, we want to block off all the routes (or combinations of x
, y
and z
) where x^2 + y^2 == z^2
is False
. Let's look at the expansion of the above do
-block to see how it works:
pythags =
[1..] >>= \z ->
[1..z] >>= \x ->
[x..z] >>= \y ->
guard (x^2 + y^2 == z^2) >>= \_ ->
return (x, y, z)
Replacing >>=
and return
with their definitions for the list monad (and using some let-bindings to keep it readable), we obtain:
pythags =
let ret x y z = [(x, y, z)]
gd z x y = concatMap (\_ -> ret x y z) (guard $ x^2 + y^2 == z^2)
doY z x = concatMap (gd z x) [x..z]
doX z = concatMap (doY z ) [1..z]
doZ = concatMap (doX ) [1..]
in doZ
Remember that guard
returns the empty list in the case of its argument being False
. Mapping across the empty list produces the empty list, no matter what function you pass in. So an empty list produced by the call to guard
in gd
will cause gd
to produce an empty list, with \_ -> ret x y z
, which would otherwise add a result, not being actually called.
To understand why this matters, think about list-computations as a tree. With our Pythagorean triple algorithm, we need a branch starting from the top for every choice of z
, then a branch from each of these branches for every value of x
, then from each of these, a branch for every value of y
. So the tree looks like this:
start |_________________________... | | | z 1 2 3 | |____ |____________ | | | | | | x 1 1 2 1 2 3 | |_ | |___ |_ | | | | | | | | | | | y 1 1 2 2 1 2 3 2 3 3
Each combination of z, x and y represents a route through the tree. Once all the functions have been applied, the results of each branch are concatenated together, starting from the bottom. Any route where our predicate doesn't hold evaluates to an empty list, and so has no impact on this concatenation.
Exercises
editExercises |
---|
|
Relationship with monoids
editWhile discussing the Alternative
laws above, we alluded to the mathematical concept of monoids. There is in fact already a Monoid
class in Haskell (defined in Data.Monoid). A thorough presentation of monoid will be given in a later chapter. However for now it suffices to say that a minimal definition of Monoid
implements two methods; namely, a neutral element (or 'zero') and an associative binary operation (or 'plus').
class Monoid m where
mempty :: m
mappend :: m -> m -> m
For example, lists form a simple monoid:
instance Monoid [a] where
mempty = []
mappend = (++)
Looks familiar, doesn't it? In spite of the uncanny resemblance to Alternative
and MonadPlus
, there is a key difference. Note the use of [a]
instead of []
in the instance declaration. Monoids are not necessarily "wrappers" of anything, or parametrically polymorphic. For instance, the integer numbers form a monoid under addition with 0
as neutral element. Alternative
is a separate type class because it captures a specific sort of monoid with distinctive properties − for instance, a binary operation (<|>) :: Alternative f => f a -> f a -> f a
that is intrinsically linked to an Applicative
context.
Other suggested laws
editNote
Consider this as a bonus section. While it is good to be aware of there being various takes on these laws, the whole issue is, generally speaking, not worth losing sleep over.
Beyond the commonly assumed laws mentioned a few sections above, there are a handful of others which make sense from certain perspectives, but do not hold for all existing instances of Alternative
and MonadPlus
. The current MonadPlus
, in particular, might be seen as an intersection between a handful of hypothetical classes that would have additional laws.
The following two additional laws are commonly suggested for Alternative
. While they do hold for both Maybe
and lists, there are counterexamples in the core libraries. Also note that, for Alternative
s that are also MonadPlus
, the mzero
laws mentioned earlier are not a consequence of these laws.
(f <|> g) <*> a = (f <*> a) <|> (g <*> a) -- right distributivity (of <*>)
empty <*> a = empty -- right absorption (for <*>)
As for MonadPlus
, a common suggestion is the left distribution law, which holds for lists, but not for Maybe
:
(m `mplus` n) >>= k = (m >>= k) `mplus` (n >>= k) -- left distribution
Conversely, the left catch law holds for Maybe
but not for lists:
return x `mplus` m = return x -- left catch
It is generally assumed that either left distribution or left catch will hold for any MonadPlus
instance. Why not both? Suppose they both hold. Then for any x, y :: m a
,
x `mplus` y
= -- monad identity
(return x >>= id) `mplus` (return y >>= id)
= -- left distribution
(return x `mplus` return y) >>= id
= -- left catch
return x >>= id
= -- monad identity
x
This immediately rules out all but the most trivial MonadPlus
implementation. Even worse, it implies that for any x
, mzero `mplus` x = mzero
. Adding the monoid identity law mzero `mplus` x = x
then implies that the monad has only one value, and is thus isomorphic to the trivial monad Data.Proxy.Proxy
.
Finally, it is worth noting that there are divergences even about the monoid laws. One case sometimes raised against them is that for certain non-determinism monads typically expressed in terms of MonadPlus
the key laws are left zero and left distribution, while the monoid laws in such cases lead to difficulties and should be relaxed or dropped entirely.
Some entirely optional further reading, for the curious reader:
- The Haskell Wiki on MonadPlus (note that this debate long predates the existence of
Alternative
). - Distinction between typeclasses MonadPlus, Alternative, and Monoid? and Confused by the meaning of the 'Alternative' type class and its relationship to other type classes at Stack Overflow (detailed overviews of the status quo reflected by the documentation of the relevant libraries as of GHC 7.x/8.x − as opposed to the 2010 Haskell Report, which is less prescriptive on this matter.)
- From monoids to near-semirings: the essence of MonadPlus and Alternative by Rivas, Jaskelioff and Schrijvers (a formulation that includes, beyond the monoid laws, right distribution and right absorption for
Alternative
, as well as left zero and left distribution forMonadPlus
). - Wren Romano on MonadPlus and seminearrings (argues that the
MonadPlus
right zero law is too strong). - Oleg Kiselyov on the MonadPlus laws (argues against the monoid laws in the case of non-determinism monads).
- Must mplus always be associative? at Stack Overflow (a discussion about the merits of the monoid laws of
MonadPlus
).