Introduction to Programming Languages/Functional Data Structures

Most data structures used to implement Abstract Data Types such as queues, stacks and sequences were designed with an imperative mindset. They usually assume data is mutable and random access to memory is fast.

In functional languages such as ML and Haskell, random access is not the rule, but the exception. These languages discourage mutability and random access and some forbid it altogether. Nevertheless, we can still implement most of the more popular data-structures in these languages, keeping the same complexity bounds that we are likely to find in imperative implementations. We shall see some examples of such implementations in the rest of this section.

Stacks

A stack is an abstract data type which holds a collection of items. It is capable of executing two operations: push and pop. Push is responsible for adding a new item to the stack and pop retrieves the last item inserted, if there is one. Stacks are a type of LIFO data structure, an acronym which stands for Last-In, First-Out.

Stacks can be implemented in many different ways. In python, for instance, we can use the builtin lists to do it as follows:

  class Stack:
    def create_stack(self):
      self.data = []
    def push(self, item):
      self.data.append(item)
    def pop(self)
      return self.data.pop()

Since both append and pop are $O(1)$ in python, our stack implementation is capable of conducting both operations in constant time [1].

The ML implementation is as follows:

  datatype 'a stack = Stack of ('a list)
                    | Empty
  fun push item (Stack s)         = Stack (item::s)
  fun pop (Stack (first_item::s)) = (first_item, Stack s)

There are a few differences between this implementation and its python counterpart that are worth noticing. First of all, observe how pattern matching is used on both push and pop to obtain the contents of the stack.

Both implementations chose to store the contents of the stack within a list. Lists are built into both python and ML. However, they are used in different ways in these languages. In ML, lists are Algebraic Data Types backed by a little syntax sugar. They can be pattern matched on and have an special operator denoted by :: which we call cons. The cons operator allows us to extract the first element of a list (also called its head) and to append a new element to an existing list. The empty list is denoted by nil.

Another important difference appears on the implementation of pop. SML discourages the usage of mutable data. To work around that, we have to return both the popped item and the stack that is obtained after the operation is carried out. This is a pattern common to most functional implementations of data structures. When using our stack, we need to keep track of the most recent version. For instance, in an ML prompt we could write:

  $ sml stack.sml
  - val v = Stack nil;
  val v = Stack [] : 'a stack
  - val v2 = push 2 v;
  val v2 = Stack [2] : int stack
  - pop v2;
  val it = (2,Stack []) : int * int stack
  - v2;
  val it = Stack [2] : int stack
  - val (item, v3) = pop v2;
  val item = 2 : int
  val v3 = Stack [] : int stack

Since both cons (::) and pattern matching are performed in constant time, push and pop in our ML implementation are also $O(1)$ . As we will see later on this chapter, this is not always the case. It is not uncommon for functional implementations of abstract data types to be slower by a factor of $O(\log n)$ or more than their imperative counterparts.

Queues

Another common data structure is the queue. Just like stacks, queues are containers that support two operations: push and pop. However, while stacks provide Last-In First-Out ( LIFO ) access, queues are First-In First-Out ( FIFO ) structures. This acronym means that the first item inserted on the queue with push is also the first item to be returned by pop. Once more, we will begin our discussion with a simple python implementation.

  class Queue:
    def __init__(self):
      self.data = []
    def push(self, item):
      self.data.append(item)
    def pop(self):
      return self.data.pop(0)

Just like our stack example, this first python implementation uses a list to store the enqueued items. However, the time complexities are much worse than in our first example. While push remains $O(1)$ , pop is now $O(n)$ , because .pop(0) is also $O(n)$ . [2]

We can improve upon our first attempt to implement queues in at least two ways. The first way is to roll out our own list implementation instead of using python's built in lists. In doing that, we can include links to both the previous and the next element, effectively implementing a data structure called Doubly linked list . We present the code for this new implementation bellow:

  class Node:
    def __init__(self):
      self.prev = None
      self.next = None
      self.item = None

  class Queue:
    def __init__(self):
      self.begin = Node()
      self.end = Node()
      self.begin.next = self.end
      self.end.prev = self.begin

    def push(self, item):
      new_node = Node()
      new_node.item = item
      new_node.next = self.end
      new_node.prev = self.end.prev
      new_node.next.prev = new_node
      new_node.prev.next = new_node

    def pop(self):
      popped_node = self.begin.next
      new_first = popped_node.next
      self.begin.next = new_first
      new_first.prev = self.begin
      return popped_node.item

This code has much better asymptotic behavior than the previous implementation. Thanks to the doubly linked list, both push and pop are now $O(1)$ . However, the program became a lot more complex due to the many pointer manipulations that now are required. To push a new item, for instance, we have to manipulate 4 references that connect nodes. That can be quite hard to get right.

A better approach is to use two python built-in lists instead of one. We call these lists push_list and pop_list. The first, push_list, stores elements when they arrive. The list pop_list is where elements are popped from. When pop_list is empty, we simply populate it moving the elements from push_list.

  class SimplerQueue:
    def __init__(self):
      self.push_list = []
      self.pop_list = []
    def push(self, item):
      self.push_list.append(item)
    def pop(self):
      if not self.pop_list:
        while self.push_list:
          self.pop_list.append(self.push_list.pop())
      return self.pop_list.pop()

This code has the desired asymptotic properties: both push and pop are $O(1)$ . To see why that is the case, consider a sequence of $n$ push and $n$ pop operations. Each push is $O(1)$ , as it simply appends the elements to a python list. As we have seen before, the append operation is $O(1)$ in python lists. The operation pop, on the other hand, might be more expensive because items sometimes have to be moved. However, each pushed item is moved at most once. So the while loop inside pop cannot iterate for more than $n$ times in total when $n$ items are popped. This means that, while some pop operations might be a little more expensive, each pop will run in what is called amortized constant time Amortized analysis.

The trick of using two python lists to get $O(1)$ complexity is useful beyond python. To see why that is the case, notice that both push_list and pop_list are being used as stacks. All we do is to append items to their front and remove items from their back. As we have seen before, stacks are easy to implement in ML.

But before diving into the ML implementation, we need to make a small digression. It so happens that in order to implement an efficient queue in ML, we need to be able to reverse a list in $O(n)$ time. On a first attempt, one might try to use the following code to do that:

  fun reverse nil = nil
    | reverse (first_element::others) = (reverse others) @ [first_element]

That code, however, takes quadratic time. The problem is the @ operator used to merge the two lists when reversing. It takes time which is linear on the size of the first list. Since @ is used once per element and there are n elements, the complexity becomes $O(n^{2})$

We can work around this shortcoming introducing a second parameter, which stores the reversed list while this list is constructed.

  fun do_reverse nil    accumulator = accumulator
    | do_reverse (h::t) accumulator = do_reverse t (h::accumulator)

We might wrap the function above in a call that initializes the extra parameter, to give it a nicer interface:

  fun reverse x = do_reverse x nil

Now that we can reverse lists in $O(n)$ , push and pop can be implemented within the same complexity bounds as we have seen in python. The code is as follows:

  datatype 'a queue = Queue of 'a list * 'a list
  fun push item (Queue (push_list, pop_list)) =       Queue (item::push_list, pop_list)
  fun pop (Queue (push_list, first_item::pop_list)) = (first_item, Queue (push_list, pop_list))
    | pop (Queue (push_list, nil))                  = pop (Queue (nil, reverse push_list))

Binary Trees

Binary search trees are among the most important data structures in computer science. They can store sets of elements in an ordered way and, if we are lucky, conduct the following operations on these sets in O(log n):

Insert a new element
Search an element
Find the minimum element

By being lucky, we mean that there are sequences of operations that will get these complexities down to O(n). We will show how to deal with them in the next session. It is also worth noticing that this is merely a subset of what trees can do. However, some operations such as deletion can become rather complex and we will not be covering them here.

Trees are made of nodes. Each node contains at least tree things: its data and two subtrees which we will call left and right. Left contains elements strictly smaller than the data stored on the node, whereas right contains elements which are greater. Both left and right are trees themselves, which leads to the following recursive definition in ML:

   datatype 'a tree = Leaf
                    | Node of 'a tree * 'a * 'a tree

As usual, we begin with a python implementation of the three operations: insert, search and find minimum. Just like the datatype above, the python implementation starts with a description of a tree.

  class Tree:
    def __init__(self, left=None, data=None, right=None):
      self.left = left
      self.right = right
      self.data = data

  def insert(node, data):
    if node is None:
      return Tree(None, data, None)
    elif data < node.data:
      return TreeNode(insert(node.left, data), node.data, node.right)
    elif data > node.data:
      return TreeNode(node.left, node.data, insert(node.right, data))

  def find(node, data):
    if node is None:
      return False
    elif data < node.data:
      return find(node.left, data)
    elif data > node.data:
      return find(node.right, data)
    else:
      return True

  def find_min(node):
    if node.left:
      return find_min(node.left)
    else:
      return node.data

This code is actually much more functional than the examples we have seen before. Notice how the tree operations are implemented recursively. In fact, they can be translated to ML rather easily. The equivalent ML code is shown below.

datatype 'a tree = Node of 'a tree * 'a * 'a tree
                 | Empty

fun insert item Empty                      = Node (Empty, item, Empty)
  | insert item (Node (left, data, right)) = 
  if item < data then 
  	(Node (insert item left, data, right))
  else if item > data then
  	(Node (left, data, insert item right))
  else 
  	(Node (left, data, right))

fun find item Empty = false
  | find item (Node (left, data, right)) = 
  if item < data then 
  	find item left 
  else if item > data then
  	find item right
  else 
  	true

fun find_min (Node (Empty, data, _)) = data
  | find_min (Node (left, _, _))     = find_min left

Notice how the two implementations are quite similar. This happens because algorithms that manipulate trees have a simple recursive definition, and recursive algorithms are a natural fit for functional programming languages.

There is, however, one important difference. Like what happened on the stack implementation, a new tree has to be returned after an insertion. It is the responsibility of the caller to keep track of that. Below, we present an example of how the tree above could be used from the ML prompt.

- val t = Empty;
val t = Empty : 'a tree
- val t0 = Empty;
val t0 = Empty : 'a tree
- val t1 = insert 1 t0; 
val t1 = Node (Empty,1,Empty) : int tree
- val t2 = insert 2 t1;
val t2 = Node (Empty,1,Node (Empty,2,Empty)) : int tree
- val t3 = insert 0 t2;
val t3 = Node (Node (Empty,0,Empty),1,Node (Empty,2,Empty)) : int tree
- find 0 t3;
val it = true : bool
- find 0 t1;
val it = false : bool

Tries

The tree implementation discussed on the previous section has a serious drawback. For certains sequences of insertions, it can become too deep, which in turn increases the lookup and insertion times to $O(n)$ . To see that, consider this sequence of insertions:

- val t = Empty;
val t = Empty : 'a tree
- val t = insert 10 t;
val t = Node (Empty,10,Empty) : int tree
- val t = insert 9 t;
val t = Node (Node (Empty,9,Empty),10,Empty) : int tree
- val t = insert 8 t;
val t = Node (Node (Node #,9,Empty),10,Empty) : int tree
- val t = insert 7 t;
val t = Node (Node (Node #,9,Empty),10,Empty) : int tree
- val t = insert 6 t;
val t = Node (Node (Node #,9,Empty),10,Empty) : int tree
...

Although the ML interpreter stops printing the complete tree after a few insertions, the pattern is clear. By inserting nodes in order, the tree effectively becomes a list. Each node has only one child subtree, which is the one at its left.

There are many approaches to solve this problem. One popular approach is to use a balanced binary tree such as a Red-Black tree. Balanced trees incorporate some changes to the insertion algorithm to make sure insertions do not make the tree too deep. However, these changes can be complex.

Another approach is to incorporate an extra key to each node. It can be shown that if these extra keys are chosen randomly and we are able to write the insertion procedure in such a way that the extra key of a node is always bigger than the extra keys of its children, then the resulting tree will likely be shallow. This is known as Treap. Treaps are much easier to implement than balanced trees. However, simpler alternatives still exist.

A third approach is to explore particular properties of the keys. This is the approach that we will follow in this section. Specifically, we will assume that the items stored on nodes are fixed size integers. Given this assumption, we will use each bit in the key's binary representation to position it on the tree. This yields a data structure called Trie. Tries are much simpler to maintain, because they do not require balancing.

This time, we will begin with the ML implementation. A Trie is simply a tree with two kinds of nodes: internal nodes and leafs. We can declare it as follows:

datatype trie = Node of trie * trie
              | Empty
              | Leaf

Each node corresponds to a bit in the binary representation of the keys. The position of the bit is given by the depth of the node. For instance, the root node corresponds to the leftmost bit and its children correspond to the second bit from left to right. Keys that begin with 0 are recursively stored in the left subtree. We call this left subtree low. Keys that have 1 on the leftmost bit are stored to the right and we call that subtree high. The special value Empty is used to indicate that there is nothing on a particular subtree. Leaf indicates that we have reached the last bit of a key and that it is indeed present.

Since ML has no builtin bitwise operations, we begin by defining two functions to recover a bit in the binary representation of a number and to set it to one. Notice that these operations are linear on the number of bits that a key might have. In a language with support to bitwise operations, such as C or Java, all these operations could be implemented to run in O(1).

fun getbit n 1 = n mod 2
  | getbit n i = getbit (n div 2) (i - 1)

fun setbit n 1 = 
    if n mod 2 = 0 then n + 1 
    else n
  | setbit n i = 
    if n mod 2 = 0 then 2 * setbit (n div 2) (i-1)  
    else 2 * setbit (n div 2) (i-1) + 1

The code of the trie follows. It implements the same operations of the binary tree with a fundamental advantage: as the bits are assigned to nodes based on how deep they are on the tree and keys are usually small, the tree can never become too deep. In fact, its depth never exceeds $O(\log n)$ where $n$ is the maximum value any key can have. Thanks to that property, the operations described are also $O(\log n)$ . In typical hardware, $\log n$ never exceeds 64.

fun do_insert item _     0 = Leaf
  | do_insert item Empty b = 
    if getbit item b = 0 then Node (do_insert item Empty (b - 1), Empty)
    else                      Node (Empty, do_insert item Empty (b - 1))
  | do_insert item (Node (low, high)) b = 
    if getbit item b = 0 then Node (do_insert item low (b - 1), high)
    else                      Node (low, do_insert item high (b - 1))
  | do_insert item Leaf b = Leaf

fun do_find item Leaf               0 = true
  | do_find item Empty              b = false
  | do_find item (Node (low, high)) b = 
    if getbit item b = 0 then do_find item low  (b - 1)
    else                      do_find item high (b - 1)

fun do_find_min Leaf                  b = 0
  | do_find_min (Node (Empty, high )) b = setbit (do_find_min high (b - 1)) b
  | do_find_min (Node (low,   _   ))  b = do_find_min low (b - 1)

The three functions work by transversing the trie while keeping track of the bit they are currently dealing with. To make things easier, we can use the following aliases assuming that keys are no longer than 10 bits.

fun insert item t = do_insert item t 10;
fun find item t = do_find item t 10;
fun find_min t = do_find_min t 10;

Using the trie implementation from the command prompt is just as easy as using a binary tree.

- val t = Empty;
val t = Empty : trie
- val t = insert 1 t;
val t = Node (Node (Node #,Empty),Empty) : trie
- val t = insert 2 t;
val t = Node (Node (Node #,Empty),Empty) : trie
- find 1 t;
val it = true : bool
- find 2 t;
val it = true : bool
- find 3 t;
val it = false : bool
- find_min t;
val it = 1 : int

← Algebraic_Data_Types · Memory_Allocation →