Last modified on 17 May 2014, at 17:21

Haskell/Building vocabulary

This chapter will be somewhat different from the surrounding ones. Think of it as an interlude, where the main goal is not to introduce new features, but to present advice for studying (and using!) Haskell. Here, we will discuss the importance of acquiring a vocabulary of functions and how this book, along with other resources, can help you with that. First, however, we need to make a few quick points about function composition.

Function compositionEdit

Function composition is a really simple concept. It just means applying one function to a value and then applying another function to the result. Consider these two very simple functions:

Example: Simple functions

f x = x + 3
square x = x^2

We can compose them in two different ways, depending on which one we apply first:

Prelude> square (f 1)
16
Prelude> square (f 2)
25
Prelude> f (square 1)
4
Prelude> f (square 2)
7

The parentheses around the inner function are necessary; otherwise, the interpreter would think that you were trying to get the value of square f, or f square; and both have no meaning.

The composition of two functions results in a function in its own right. If applying f and then square, or vice-versa, to a number were a frequent, meaningful or otherwise important operations in a program, a natural next step would be defining:

Example: Composed functions

squareOfF x = square (f x)
 
fOfSquare x = f (square x)

There is a second, nifty way of writing composed functions. It uses (.), the function composition operator, and is as simple as putting a period between the two functions:

Example: Composing functions with (.)

squareOfF x = (square . f) x
 
fOfSquare x = (f . square) x

Note that functions are still applied from right to left, so that g(f(x)) == (g . f) x [1].

The need for a vocabularyEdit

Function composition allows us to define complicated functions using simpler ones as building blocks. One of the key qualities of Haskell is how simple it is to write composed functions, no matter if the base functions are written by ourselves or by someone else,[2] and the extent that helps us in writing simple, elegant and expressive code.

In order to use function composition, we first need to have functions to compose. While naturally the functions we write ourselves will always be available, every installation of GHC comes with a vast assortment of libraries (that is, packaged code), which provide functions for various common tasks. For that reason, it is vital for effective Haskell programming to develop some familiarity with the essential libraries. At the very least, you should know how to find useful functions in the libraries when you need them.

We can look at this issue from a different perspective. We have gone through a substantial portion of the Haskell syntax already; and, by the time we are done with the upcoming Recursion chapter, we could, in principle, write pretty much any list manipulation program we want. However, writing full programs at this point would be terribly inefficient, mainly because we would end up rewriting large parts of the standard libraries. It is far easier to have the libraries deal as much as possible with trivial well-known problems while we dedicate our brain cells to solving the problems we are truly interested in. Furthermore, the functionality provided by libraries helps us in developing our own algorithms.[3]

Prelude and the librariesEdit

Here are a few basic facts about Haskell libraries:

First and foremost, there is Prelude, which is the core library loaded by default in every Haskell program. Alongside with the basic types, it provides a set of ubiquitous and extremely useful functions. We will refer to Prelude and its functions all the time throughout these introductory chapters.

Beyond Prelude, there is a set of core libraries, which provide a much wider range of tools. Although they are provided by default with GHC, they are not loaded automatically like Prelude. Instead, they are made available as modules, which must be imported into your program. Later on, we will explain the minutiae of how modules work. For now, just know that your source file needs lines near the top to import any necessary modules. For example, the function permutations is in the module Data.List, and, to import that, add the line import Data.List to the top of your .hs file. Here's how a full source file looks:

Example: Importing a module in a source file

import Data.List
 
testPermutations = permutations "Prelude"

For quick GHCi tests, just enter :m +Data.List at the command line to load that module.

Prelude> :m +Data.List
Prelude Data.List> :t permutations
permutations :: [a] -> [[a]]

One exhibitEdit

Before continuing, let us see one (slightly histrionic, we admit) example of what familiarity with a few basic functions from Prelude can bring us.[4] Suppose we need a function which takes a string composed of words separated by spaces and returns that string with the order of the words reversed, so that "Mary had a little lamb" becomes "lamb little a had Mary". Now, we can solve that problem using exclusively what we have seen so far about Haskell, plus a few insights that can be acquired by studying the Recursion chapter. Here is what it might look like. Don't stare at it for too long!

Example: There be dragons

monsterRevWords :: String -> String
monsterRevWords input = rejoinUnreversed (divideReversed input)
    where
    divideReversed s = go1 [] s
        where
        go1 divided [] = divided
        go1 [] (c:cs)
            | testSpace c = go1 [] cs
            | otherwise   = go1 [[]] (c:cs)
        go1 (w:ws) [c]
            | testSpace c = (w:ws)
            | otherwise   = ((c:w):ws)
        go1 (w:ws) (c:c':cs)
            | testSpace c =
                if testSpace c'
                    then go1 (w:ws) (c':cs)
                    else go1 ([c']:w:ws) cs
            | otherwise =
                if testSpace c'
                    then go1 ((c:w):ws) (c':cs)
                    else go1 ((c:w):ws) (c':cs)
    testSpace c = c == ' '
    rejoinUnreversed [] = []
    rejoinUnreversed [w] = reverseList w
    rejoinUnreversed strings = go2 (' ' : reverseList newFirstWord) (otherWords)
        where
        (newFirstWord : otherWords) = reverseList strings
        go2 rejoined ([]:[]) = rejoined
        go2 rejoined ([]:(w':ws')) = go2 (rejoined) ((' ':w'):ws')
        go2 rejoined ((c:cs):ws) = go2 (c:rejoined) (cs:ws)
    reverseList [] = []
    reverseList w = go3 [] w
        where
        go3 rev [] = rev
        go3 rev (c:cs) = go3 (c:rev) cs

There are too many problems with this thing; so let us consider just three of them:

  • If we claimed that monsterRevWords does what is expected, you could either take our word for it, test it exhaustively on all sorts of possible inputs or attempt to understand it and get an awful headache (please don't).
  • Furthermore, if we write a function this ugly and have to fix a bug or slightly modify it later on,[5] we are set for an awful time.
  • Finally, there is at least one easy to spot potential problem: if you have another glance at the definition, about halfway down there is a testSpace helper function which checks if a character is a space or not. The test, however, only includes the common space character (that is, ' '), and not other whitespace characters (tabs, newlines, etc.)[6].

Instead of the junk above, we can do much better by using the following Prelude functions:

  • words, which reliably breaks down a string in whitespace delimited words, returning a list of strings;
  • reverse, which reverses a list (incidentally, that is exactly what the reverseList above does); and
  • unwords, which does the opposite of words;

then function composition means our problem is instantly solved.

Example: revWords done the Haskell way

revWords :: String -> String
revWords input = (unwords . reverse . words) input

That's short, simple, readable and, since Prelude is reliable, bug-free.[7] So, any time some program you are writing begins to look like monsterRevWords, look around and reach for your toolbox - the libraries.

Acquiring vocabularyEdit

After the stern warnings above, you might expect us to continue with diving deep into the standard libraries. That is not the route we will follow, however - at least not in the first part of the book. The Beginner's Track is meant to cover most of the Haskell language functionality in a readable and reasonably compact account, and a linear, systematic study of the libraries would in all likelihood have us sacrificing either one attribute or the other.

In any case, the libraries will remain close at hand as we advance in the course (so there's no need for you to pause your reading just to study the libraries on your own). Here are a few suggestions on resources you can use to learn about them.

With this bookEdit

  • Once we enter Elementary Haskell, you will notice several of the exercises - mainly, among those about list processing - involve writing equivalent definitions for Prelude functions. For each of these exercises you do, one more function will be added to your repertoire.
  • Every now and then we will introduce more library functions; maybe within an example, or just with a mention in passing. Whenever we do so, take a minute to test the function and do some experiments. Remember to extend that habitual curiosity about types we mentioned in Type basics to the functions themselves.
  • While the first few chapters are quite tightly-knit, later parts of the book are more independent — especially the third track, Haskell in Practice. There, among other things, you can find chapters on the Hierarchical libraries; and most of their content can be understood soon after having completed Elementary Haskell.
  • As we reach the later parts of the Beginner's track, the concepts we will discuss (monads in particular) will naturally lead to exploration of important parts of the core libraries.

Other resourcesEdit

  • First and foremost, there is the documentation. While it is probably too dry to be really useful right now, it will prove valuable soon enough. You can read the Prelude specification on-line as well as the documentation of the libraries bundled with GHC, with nice navigation and source code just one click away.
  • Hoogle is a great way to search through the documentation. It is a Haskell search engine which covers the core libraries. You can search for everything from function names to type definitions and more.
  • Beyond the libraries included with GHC, there is a large ecosystem of libraries, made available through Hackage and installable with a tool called cabal. The Hackage site you can find documentation for the libraries available through it. We will not venture outside of the core libraries in the Beginner's Track; however, you will certainly be drawn to Hackage once you begin your own projects. A second Haskell search engine is Hayoo!; it covers all of Hackage.
  • When appropriate, we will give pointers to other useful learning resources, specially when we move towards intermediate and advanced topics.


NotesEdit

  1. (.) is modelled after the mathematical operator \circ, which works in the same way: (g \circ f)(x) = g(f(x))
  2. Such ease is not only due to the bits of syntax we mentioned above, but mainly due to features we will explain and discuss in depth later in the book - in particular, higher-order functions.
  3. One simple example is provided by functions like map, filter and the folds, which we will cover in the chapters on list processing just ahead. Another would be the various monad libraries, which will be studied in depth later on.
  4. The example here is inspired by the Simple Unix tools demo in the HaskellWiki.
  5. Co-author's note: "Later on? I wrote that half an hour ago and I'm not totally sure about how it works already..."
  6. A reliable way of checking whether a character is whitespace is with the isSpace function, which is in the module Data.Char.
  7. In case you are wondering, there are lots of other functions, either in Prelude or in Data.List which, in one way or another, would help to make monsterRevWords somewhat saner - just to name a few: (++), concat, groupBy, intersperse. There is no need for them in this case, though, since nothing compares to the one-liner above.