Haskell/Preliminaries

(All the examples in this chapter can be typed into a Haskell source file and evaluated by loading that file into GHC or Hugs.)

Variables that Don't and Functions that DoEdit

One of the main ideas behind functional programming is that of referential transparency. This refers to the fact that a function and a parameter can always be replaced by the function's return value: for example, 2+2 is (almost) completely indistinguishable from 4. This facilitates a number of very important concepts in Haskell, like lazy evaluation and the ability to reorder expressions to execute them concurrently.

The above aren't the only reasons why this is done, but you'll find out more about them as you read on about Haskell.

Haskell variables, unlike those in imperative languages such as C++, are not memory locations. They are bound to a particular value for their lifetimes and are more like constants than variables in most cases. This is the same as the idea of a variable in math. In the following example, x is bound to 3; it can't possibly change.

x = 6 / 2

This may seem extremely weird if you've never seen it before. You're also most likely asking yourself, how can I write useful programs without actual variables? Well, Haskell's function parameters can serve the same purpose: once you pass a value to a function, that function can use it wherever it wants. To change the value of a parameter, just call the function again. So the answer is recursion.

Here are some Haskell declarations:

x = 3
y = x * 2

As you've probably guessed, by the end y = 6. Looks quite a bit like math doesn't it?

However, unlike a conventional language, the order doesn't matter. Haskell doesn't have a notion of "flow of control", so it doesn't start executing at the top and working its way down. So this means exactly the same as the above:

y = x * 2
x = 3

You can of course also define functions. This is where life starts to get a little more interesting.

double x = x * 2
z = double y

This says that the function "double" takes an argument "x" and the result is equal to twice x. Note the lack of parentheses: in Haskell "double y" means apply the function "double" to the value "y". Parentheses are only needed to determine evaluation priority, just like in math.

So far this lets you do about as much as a spreadsheet would: variables in this world look a lot like cells in a spreadsheet, and you can set up functions that can "do" things with these variables.

Now look at this function:

factorial 0 = 1
factorial n = n * factorial (n-1)

This defines a new function called "factorial". The first line says that the factorial of 0 is 1, and the second one says that the factorial of any other number "n" is equal to "n" times the factorial of "n-1". Note the parentheses around the "n-1": without them this would have been parsed as "(factorial n) - 1"; function application (as this process is known) has higher precedence than any regular operator like addition or exponentiation.

This shows how you do "loops" in Haskell. You can't have a loop counter like "n = n - 1" because that would change the value of "n". So instead you use recursion.

Unlike the other example, the order of the two recursive declarations is important. Haskell matches function calls starting at the top and picking the first one that matches. So if the two declarations were reversed then the compiler would conclude that factorial 0 equals 0 * factorial -1, and so on to infinity. Not what we want.

Tail RecursionEdit

In case you're wondering about stack space, won't all this recursion eventually overflow the stack? The answer is that Haskell supports tail recursion (every decent functional language's compiler does). Tail recursion is the idea that if the last thing a function does is call itself, the interpreter may as well not bother putting things on the stack, and just "jump" to the beginning using the function's new arguments. In this case, though, the factorial function has to call itself with "(n-1)" and then multiply the result by "n".

So our factorial function isn't tail recursive (because it has to multiply result by n after calling itself). This one is, though:

factorial n = factorial' n 1

factorial' 0 a = a
factorial' n a = factorial' (n-1) (n*a)

Here are two functions. They have almost the same name, except the second has a tick mark. (Haskell identifiers are allowed to contain tick marks.) By convention, it is used to indicate a variation on something.

Here, the factorial' function takes an argument "n" as before, but it also takes a second "accumulator" argument. Because calling itself is the last thing it does the tail recursion rule applies: nothing gets put on the stack; the factorial' function simply executes as if its arguments had always been n-1 and n*a.

(Note the similarity between this parameter and a variable. Parameters are, in a way, the variables of Haskell.)

A programmer using this function doesn't want to worry about the accumulator argument, so the plain factorial function is used to "wrap up" factorial' by supplying the initial value of the accumulator.

ExercisesEdit

Type the factorial function into a Haskell source file and load it into your favourite Haskell environment.

What is factorial 5?

What about factorial 1000? Is that what you expected?

What about factorial -1?

TypesEdit

You may have noticed something interesting above. None of your parameters or anything was declared to be of any particular type.

That's because Haskell uses a type system to infer the type of every expression, the Hindley-Milner type system. But Haskell is not weakly typed; it's strongly typed like C++, so you can't get runtime type errors. Any program that compiles is guaranteed to not have type errors in it. Another advantage is that when you make a mistake, it will almost always change the type of the resulting expression, and it'll be flagged by the compiler as such. (Your program simply won't compile.) You may still want to write types for your functions for a few reasons.

  1. It helps to document your program.
  2. When you get something wrong, the type declarations help to narrow down the source of the problem.
  3. Most importantly, it will help you learn. Understanding the type system is fundamental to getting a good grasp on Haskell.

The syntax for type declarations looks like this:

x :: Integer
factorial :: Integer -> Integer

You should read the first declaration as saying "x is of type Integer" or "x has type Integer". The '::' tells us we're referring to a type. You'll note factorial looks a bit different. It's a function that takes an Integer and returns an Integer. Later on you'll see why we use -> in describing the types, and why describing the type of a function is so close syntactically to describing the type of a variable.

One final note: identifiers that start with uppercase letters are reserved for types, modules and data constructors (somewhat like namespaces, don't worry about these for now). If you try to declare a variable or function that begins with an uppercase letter, the compiler will complain.

Last modified on 28 October 2009, at 18:12