Haskell/Monoids

      In earlier parts of the book, we have made a few passing allusions to monoids and the Monoid type class (most notably when discussing MonadPlus). Now it is time to give them a more detained look, and show what makes them useful.

      Introduction

      To put it bluntly, a monoid is a data type with an associative operation which combines two elements and a "zero" value which acts as neutral element for the operation. The Monoid captures this general concept:

      class Monoid a where 
        mempty  :: a
        mappend :: a -> a -> a
        mconcat :: [a] -> a
      

      The names of the class methods are highly suggestive; and indeed lists are a paradigmatic instance of Monoid, with [] as "zero"/mempty and (++) as mappend, the operation combining two elements. The third method, mconcat, is provided as bonus, and has a default implementation of mconcat = foldr mappend mempty which, naturally, is equivalent to concat for lists.

      It is legitimate to think of monoids as types which support appending, though some poetic license with respect to what is meant by "appending" is advisable, as the Monoid definition is extremely generic and not at all limited to data structures. Integer numbers, for instance, form a monoid with addition as "appending" and 0 as, well, zero:

      instance Monoid Integer where
        mempty = 0
        mappend = (+)
      

      Monoids also have laws, which the instances of Monoid must follow. They are really simple, though; corresponding to the neutral element and associativity properties we mentioned:

      mappend mempty x = x
      mappend x mempty = x
      mappend x (mappend y z) = mappend (mappend x y) z
      

      The Monoid class is defined in Data.Monoid, along with instances for many common types.

      Exercises
      1. There is a second possible instance of Monoid for Integer, using multiplication instead of addition as the combining operation. Implement that instance.[1]
      ↑Jump back a section

      Examples

      For a class with such a pompous name, monoids might seem just too simple, or just plain boring. Do not let any such impression mislead you, though: they have many interesting applications. We will mention some of them in this section.

      The Writer monad 
      A computation of type Writer w a returns a value of type a while producing accumulating output of type w. w is an instance of Monoid, and the bind operator of the monad uses mappend to accumulate the output. It is easy to picture w as a list type used for some kind of logging; we can use any monoid instance, though, which opens up many possibilities thanks to the generality of the class.
      The Foldable class 
      Foldable, available from Data.Foldable, provide a simple way to generalize list folding (as well as many related Data.List) functions to other data structures.[2] The implementations in Data.Foldable take advantage of monoids by generating monoidal values from the elements of the data structure, which can then be trivially combined with mappend. Another role of the Foldable is in the definition of the Traversable class (in Data.Traversable), which generalizes sequence from Control.Monad.
      Finger trees 
      Moving on from operations on data structures to data structure implementations, monoids can be used to implement finger trees, an efficient and very versatile data structure. Its implementation makes use of monoidal values as tags for the tree nodes; and different data structures - such as sequences, priority queues and search trees - can be obtained simply by changing the involved Monoid instance. This blog post, based on a paper by Ralf Hinze and Ross Patterson[3], contains a brief and accessible explanation on how monoids are used in finger trees.
      Options and settings
      In a wholly different context, monoids can be a handy way of treating application options and settings. Two examples are Cabal, the Haskell packaging system ("Package databases are monoids. Configuration files are monoids. Command line flags and sets of command line flags are monoids. Package build information is a monoid."[1]) and XMonad, a tiling window manager implemented in Haskell ("xmonad configuration hooks are monoidal."[2]).
      ↑Jump back a section

      Homomorphisms

      A function f :: a -> b between two monoids a and b is called a homomorphism if it preserves the monoid structure, so that:

      f mempty          = mempty
      f (x `mappend` y) = f x `mappend` f y
      

      For instance, length is an homomorphism between ([],++) and (Int,+)

      length []         = 0
      length (xs ++ ys) = length x + length y
      

      An interesting example "in the wild" of monoids and homomorphisms was identified by Chris Kuklewicz amidst the Google Protocol Buffers API documentation. Based on the quotes provided in the referenced comment, we highlight that the property that

          MyMessage message;
          message.ParseFromString(str1 + str2);
      

      is equivalent to

          MyMessage message, message2;
          message.ParseFromString(str1);
          message2.ParseFromString(str2);
          message.MergeFrom(message2);
      

      means that ParseFromString is a homomorphism. Translating it to Haskell and monoids, the following equations hold:

      parse :: String -> Message
      -- these are just equations, not actual code.
      parse []         = mempty
      parse (xs ++ ys) = parse xs `merge` parse ys
      

      Well, it doesn't hold perfectly, because parse can fail, but roughly so.

      ↑Jump back a section

      Further reading

      • Additional comments on Monoid usage in Cabal: [3]; [4].
      ↑Jump back a section

      Notes

      1. If you check the documentation for Data.Monoid, you will find that both instances are defined for types in Num, through the means of two newtype wrappers.
      2. Note that this kind of folding is different from the one discussed in Haskell/Other data structures.
      3. Ralf Hinze and Ross Paterson. Finger Trees: A Simple General-purpose Data Structure. Journal of Functional Programming 16, 2 (2006), pages 197-217.


      ↑Jump back a section
      Last modified on 4 January 2013, at 17:37