Lists III (Solutions) Elementary Haskell edit this chapter

## Folds

Like `map`, a fold is a higher order function that takes a function and a list. However, instead of applying the function element by element, the fold uses it to combine the list elements into a result value.

Let's look at a few concrete examples. `sum` could be implemented as:

Example: sum

```sum :: [Integer] -> Integer
sum []     = 0
sum (x:xs) = x + sum xs
```

and `product` as:

Example: product

```product :: [Integer] -> Integer
product []     = 1
product (x:xs) = x * product xs
```

`concat`, which takes a list of lists and joins (concatenates) them into one:

Example: concat

```concat :: [[a]] -> [a]
concat []     = []
concat (x:xs) = x ++ concat xs
```

All these examples show a pattern of recursion known as a fold. Think of the name referring to a list getting "folded up" into a single value or to a function being "folded between" the elements of the list.

Prelude defines four `fold` functions: `foldr`, `foldl`, `foldr1` and `foldl1`.

### foldr

The right-associative `foldr` folds up a list from the right to left. As it proceeds, foldr uses the given function to combine each of the elements with the running value called the accumulator. When calling foldr, the initial value of the accumulator is set as an argument.

```foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f acc []     = acc
foldr f acc (x:xs) = f x (foldr f acc xs)
```

The first argument to `foldr` is a function with two arguments. The second argument is value for the accumulator (which often starts at a neutral "zero" value). The third argument is the list to be folded.

In `sum`, `f` is `(+)`, and `acc` is `0`. In `concat`, `f` is `(++)` and `acc` is `[]`. In many cases (like all of our examples so far), the function passed to a fold will be one that takes two arguments of the same type, but this is not necessarily the case (as we can see from the `(a -> b -> b)` part of the type signature — if the types had to be the same, the first two letters in the type signature would have matched).

Remember, a list in Haskell written as `[a, b, c]` is an alternative (syntactic sugar) style for `a : b : c : []`.

Now, `foldr f acc xs` in the `foldr` definition simply replaces each cons (:) in the `xs` list with the function `f` while replacing the empty list at the end with `acc`:

```foldr f acc (a:b:c:[]) = f a (f b (f c acc))
```

Note how the parentheses nest around the right end of the list.

An elegant visualisation is given by picturing the list data structure as a tree:

```  :                         f
/ \                       / \
a   :       foldr f acc   a   f
/ \    ------------->     / \
b   :                     b   f
/ \                       / \
c  []                     c   acc
```

We can see here that `foldr (:) []` will return the list completely unchanged. That sort of function that has no effect is called an identity function. You should start building a habit of looking for identity functions in different cases, and we'll discuss them more later when we learn about monoids.

### foldl

The left-associative `foldl` processes the list in the opposite direction, starting at the left side with the first element.

```foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f acc []     =  acc
foldl f acc (x:xs) =  foldl f (f acc x) xs
```

So, parentheses in the resulting expression accumulate around the left end of the list:

```foldl f acc (a:b:c:[]) = f (f (f acc a) b) c
```

The corresponding trees look like:

```  :                            f
/ \                          / \
a   :       foldl f acc      f   c
/ \    ------------->    / \
b   :                    f   b
/ \                  / \
c  []                acc a
```

Because all folds include both left and right elements, beginners can get confused by the names. You could think of foldr as short for fold-right-to-left and foldl as fold-left-to-right. The names refer to where the fold starts.

Note

Technical Note: foldl is tail-recursive, that is, it recurses immediately, calling itself. For this reason the compiler will optimise it to a simple loop for efficiency. However, Haskell is a lazy language, so the calls to f will be left unevaluated by default, thus building up an unevaluated expression in memory that includes the entire length of the list. To avoid running out of memory, we have a version of foldl called `foldl'` that is strict — it forces the evaluation of f immediately at each step.

An apostrophe at the end of a function name is pronounced "tick" as in "fold-L-tick" or "prime" as in "fold-L-prime". A tick is a valid character in Haskell identifiers. `foldl'` can be found in the `Data.List` library module (imported by adding `import Data.List` to the beginning of a source file). As a rule of thumb, you should use `foldr` on lists that might be infinite or where the fold is building up a data structure and use `foldl'` if the list is known to be finite and comes down to a single value. There is almost never a good reason to use `foldl` (without the tick), though it might just work if the lists fed to it are not too long.

### Are foldl and foldr opposites?

You may notice that in some cases `foldl` and `foldr` do not appear to be opposites. Let's examine one such case, involving subtraction as the combining operation. Will we get `True` or `False` for each of the equalities below?

```foldl (-) 6 [3, 2, 1] == 6 - 3 - 2 - 1
foldr (-) 6 [1, 2, 3] == 6 - 3 - 2 - 1
```

Thinking of `foldr` as going from right to left might lead us to suppose that the second equality would be true, as the rightmost elements show up before the leftmost ones. That, however, is not what we actually see:

```GHCi> foldl (-) 6 [3, 2, 1] == 6 - 3 - 2 - 1
True
GHCi> foldr (-) 6 [1, 2, 3] == 6 - 3 - 2 - 1
False
```

This happens because the difference between `foldr` and `foldl` lies in the way the resulting expression is associated, and not in the left-to-right order of the elements of the list. Here is the expansion of the expressions above, with explicit parentheses:

```foldl (-) 6 [3, 2, 1] == ((6 - 3) - 2) - 1 -- True
foldr (-) 6 [1, 2, 3] == 1 - (2 - (3 - 6)) -- True
```

Also note how the initial accumulator (`6` in this example) is always found in the innermost parentheses.

For the sake of contrast, here is a simulated imperative style that does change the order of the elements in the list: 

```GHCi> import Data.List -- So that we can give foldl' a try
GHCi> reduce      f acc list = foldl' f acc list -- This is the same as plain old foldl'
GHCi> reduceRight f acc list = foldl' f acc (reverse list) -- keep foldl', but reverse list
```

Now we get `True` in both cases from our initial example, because both are folding from the left:

```GHCi> reduce      (-) 6 [3, 2, 1] == 6 - 3 - 2 - 1
True
GHCi> reduceRight (-) 6 [1, 2, 3] == 6 - 3 - 2 - 1
True
```

### foldr1 and foldl1

As previously noted, the type declaration for `foldr` makes it quite possible for the list elements and result to be of different types. For example, "read" is a function that takes a string and converts it into some type (the type system is smart enough to figure out which one). In this case we convert it into a float.

Example: The list elements and results can have different types

```addStr :: String -> Float -> Float

sumStr :: [String] -> Float
```

There is also a variant called `foldr1` ("fold - R - one") which dispenses with an explicit "zero" for an accumulator by taking the last element of the list instead:

```foldr1           :: (a -> a -> a) -> [a] -> a
foldr1 f [x]     =  x
foldr1 f (x:xs)  =  f x (foldr1 f xs)
foldr1 _ []      =  error "Prelude.foldr1: empty list"
```

And `foldl1` as well:

```foldl1           :: (a -> a -> a) -> [a] -> a
foldl1 f [x]     =  x
foldl1 f (x:xs)  =  foldl f x xs
foldl1 _ []      =  error "Prelude.foldl1: empty list"
```

Note: Just like for foldl, the Data.List library includes foldl1' as a strict version of foldl1.

With foldl1 and foldr1, all the types have to be the same, and an empty list is an error. These variants are useful when there is no obvious candidate for the initial accumulator value and we are sure that the list is not going to be empty. When in doubt, stick with foldr or foldl'.

### folds and laziness

One reason that right-associative folds are more natural in Haskell than left-associative ones is that right folds can operate on infinite lists. A fold that returns an infinite list is perfectly usable in a larger context that doesn't need to access the entire infinite result. In that case, foldr can move along as much as needed and the compiler will know when to stop. However, a left fold necessarily calls itself recursively until it reaches the end of the input list (because the recursive call is not made in an argument to f). Needless to say, no end will be reached if an input list to foldl is infinite.

As a toy example, consider a function `echoes` that takes a list of integers and produces a list such that wherever the number n occurs in the input list, it is replicated n times in the output list. To create our echoes function, we will use the prelude function `replicate` in which `replicate n x` is a list of length n with x the value of every element.

We can write echoes as a foldr quite handily:

```echoes :: [Int] -> [Int]
echoes = foldr (\ x xs -> (replicate x x) ++ xs) []
```
```GHCi> take 10 (echoes [1..])
[1,2,2,3,3,3,4,4,4,4]
```

(Note: This definition is compact thanks to the `\ x xs ->` syntax. The `\`, meant to look like a lambda (λ), works as an unnamed function for cases where we won't use the function again anywhere else. Thus, we provide the definition of our one-time function in situ. In this case, `x` and `xs` are the arguments, and the right-hand side of the definition is what comes after the `->`.)

We could have instead used a foldl:

```echoes :: [Int] -> [Int]
echoes = foldl (\ xs x -> xs ++ (replicate x x)) []
```
```GHCi> take 10 (echoes [1..])     -- not terminating
```

but only the foldr version works on infinite lists. What would happen if you just evaluate `echoes [1..]`? Try it! (If you try this in GHCi or a terminal, remember you can stop an evaluation with Ctrl-c, but you have to be quick and keep an eye on the system monitor or your memory will be consumed in no time and your system will hang.)

As a final example, `map` itself can be implemented as a fold:

```map :: (a -> b) -> [a] -> [b]
map f = foldr (\ x xs -> f x : xs) []
```

Folding takes some time to get used to, but it is a fundamental pattern in functional programming and eventually becomes very natural. Any time you want to traverse a list and build up a result from its members, you likely want a fold.

Exercises
1. Define the following functions recursively (like the definitions for `sum`, `product` and `concat` above), then turn them into a fold:
• `and :: [Bool] -> Bool`, which returns True if a list of Bools are all True, and False otherwise.
• `or :: [Bool] -> Bool`, which returns True if any of a list of Bools are True, and False otherwise.
2. Define the following functions using `foldl1` or `foldr1`:
• `maximum :: Ord a => [a] -> a`, which returns the maximum element of a list (hint: `max :: Ord a => a -> a -> a` returns the maximum of two values).
• `minimum :: Ord a => [a] -> a`, which returns the minimum element of a list (hint: `min :: Ord a => a -> a -> a` returns the minimum of two values).
3. Use a fold (which one?) to define `reverse :: [a] -> [a]`, which returns a list with the elements in reverse order.
Note that all of these are Prelude functions, so they will be always close at hand when you need them. (Also, that means you must use slightly different names in order to test your answers in GHCi.)

## Scans

A "scan" is like a cross between a `map` and a fold. Folding a list accumulates a single return value, whereas mapping puts each item through a function returning a separate result for each item. A scan does both: it accumulates a value like a fold, but instead of returning only a final value it returns a list of all the intermediate values.

Prelude contains four scan functions:

```scanl :: (a -> b -> a) -> a -> [b] -> [a]
```

`scanl` accumulates the list from the left, and the second argument becomes the first item in the resulting list. So, `scanl (+) 0 [1,2,3] = [0,1,3,6]`.

```scanl1 :: (a -> a -> a) -> [a] -> [a]
```

`scanl1` uses the first item of the list as a zero parameter. It is what you would typically use if the input and output items are the same type. Notice the difference in the type signatures between `scanl` and `scanl1`. `scanl1 (+) [1,2,3] = [1,3,6]`.

```scanr :: (a -> b -> b) -> b -> [a] -> [b]
scanr (+) 0 [1,2,3] = [6,5,3,0]

scanr1 :: (a -> a -> a) -> [a] -> [a]
scanr1 (+) [1,2,3] = [6,5,3]
```

These two functions are the counterparts of `scanl` and `scanl1` that accumulate the totals from the right.

Exercises
1. Write your own definition of `scanr`, first using recursion, and then using `foldr`. Do the same for `scanl` first using recursion then `foldl`.
2. Define the following functions:
• `factList :: Integer -> [Integer]`, which returns a list of factorials from 1 up to its argument. For example, `factList 4 = [1,2,6,24]`.

## filter

A common operation performed on lists is filtering — generating a new list composed only of elements of the first list that meet a certain condition. A simple example: making a list of only even numbers from a list of integers.

```retainEven :: [Int] -> [Int]
retainEven [] = []
retainEven (n:ns) =
-- mod n 2 computes the remainder for the integer division of n by 2.
if (mod n 2) == 0
then n : (retainEven ns)
else retainEven ns
```

This definition is somewhat verbose and specific. Prelude provides a concise and general `filter` function with type signature:

```filter :: (a -> Bool) -> [a] -> [a]
```

So, a `(a -> Bool)` function tests an elements for some condition, we then feed in a list to be filtered, and we get back the filtered list.

To write `retainEven` using `filter`, we need to state the condition as an auxiliary `(a -> Bool)` function, like this one:

```isEven :: Int -> Bool
isEven n = (mod n 2) == 0
```

And then retainEven becomes simply:

```retainEven ns = filter isEven ns
```

We used ns instead of xs to indicate that we know these are numbers and not just anything, but we can ignore that and use a more terse point-free definition:

```retainEven = filter isEven
```

This is like what we demonstrated before for `map` and the folds. Like `filter`, those take another function as argument; and using them point-free emphasizes this "functions-of-functions" aspect.

## List comprehensions

List comprehensions are syntactic sugar for some common list operations, such as filtering. For instance, instead of using the Prelude `filter`, we could write `retainEven` like this:

```retainEven es = [n | n <- es, isEven n]
```

This compact syntax may look intimidating, but it is simple to break down. One interpretation is:

• (Starting from the middle) Take the list es and draw (the "<-") each of its elements as a value n.
• (After the comma) For each drawn n test the boolean condition `isEven n`.
• (Before the vertical bar) If (and only if) the boolean condition is satisfied, append n to the new list being created (note the square brackets around the whole expression).

Thus, if `es` is [1,2,3,4], then we would get back the list [2,4]. 1 and 3 were not drawn because `(isEven n) == False `.

The power of list comprehensions comes from being easily extensible. Firstly, we can use as many tests as we wish (even zero!). Multiple conditions are written as a comma-separated list of expressions (which should evaluate to a Boolean, of course). For a simple example, suppose we want to modify `retainEven` so that only numbers larger than 100 are retained:

```retainLargeEvens :: [Int] -> [Int]
retainLargeEvens es = [n | n <- es, isEven n, n > 100]
```

Furthermore, we are not limited to using `n` as the element to be appended when generating a new list. Instead, we could place any expression before the vertical bar (if it is compatible with the type of the list, of course). For instance, if we wanted to subtract one from every even number, all it would take is:

```evensMinusOne es = [n - 1 | n <- es, isEven n]
```

In effect, that means the list comprehension syntax incorporates the functionalities of `map` and `filter`.

To further sweeten things, the left arrow notation in list comprehensions can be combined with pattern matching. For example, suppose we had a list of `(Int, Int)` tuples, and we would like to construct a list with the first element of every tuple whose second element is even. Using list comprehensions, we might write it as follows:

```firstForEvenSeconds :: [(Int, Int)] -> [Int]
firstForEvenSeconds ps = [fst p | p <- ps, isEven (snd p)] -- here, p is for pairs.
```

Patterns can make it much more readable:

```firstForEvenSeconds ps = [x | (x, y) <- ps, isEven y]
```

As in other cases, arbitrary expressions may be used before the `|`. If we wanted a list with the double of those first elements:

```doubleOfFirstForEvenSeconds :: [(Int, Int)] -> [Int]
doubleOfFirstForEvenSeconds ps = [2 * x | (x, y) <- ps, isEven y]
```

Not counting spaces, that function code is shorter than its descriptive name!

There are even more possible tricks:

```allPairs :: [(Int, Int)]
allPairs = [(x, y) | x <- [1..4], y <- [5..8]]
```

This comprehension draws from two lists, and generates all possible `(x, y)` pairs with the first element drawn from `[1..4]` and the second from `[5..8]`. In the final list of pairs, the first elements will be those generated with the first element of the first list (here, `1`), then those with the second element of the first list, and so on. In this example, the full list is (linebreaks added for clarity):

```Prelude> [(x, y) | x <- [1..4], y <- [5..8]]
[(1,5),(1,6),(1,7),(1,8),
(2,5),(2,6),(2,7),(2,8),
(3,5),(3,6),(3,7),(3,8),
(4,5),(4,6),(4,7),(4,8)]
```

We could easily add a condition to restrict the combinations that go into the final list:

```somePairs = [(x, y) | x <- [1..4], y <- [5..8], x + y > 8]
```

This list only has the pairs with the sum of elements larger than 8; starting with `(1,8)`, then `(2,7)` and so forth.

Exercises
1. Write a `returnDivisible :: Int -> [Int] -> [Int]` function which filters a list of integers retaining only the numbers divisible by the integer passed as first argument. For integers x and n, x is divisible by n if `(mod x n) == 0` (note that the test for evenness is a specific case of that).
2. Write a function `choosingTails :: [[Int]] -> [[Int]]` using list comprehension syntax with appropriate guards (filters) for empty lists returning a list of tails following a head bigger than 5:
```choosingTails  [[7,6,3],[],[6,4,2],[9,4,3],[5,5,5]]
-- [[6,3],[4,2],[4,3]]
```
3. Does the order of guards matter? You may find it out by playing with the function of the preceding exercise.
4. Over this section we've seen how list comprehensions are essentially syntactic sugar for `filter` and `map`. Now work in the opposite direction and define alternative versions of the `filter` and `map` using the list comprehension syntax.
5. Rewrite `doubleOfFirstForEvenSeconds` using `filter` and `map` instead of list comprehension.