The Way of the Java/Linked lists
Linked lists
editReferences in objects
editreference embedded reference reference!embedded node cargo
In the last chapter we saw that the instance variables of an object can be arrays, and I mentioned that they can be objects, too.
One of the more interesting possibilities is that an object can contain a reference to another object of the same type. There is a common data structure, the list, that takes advantage of this feature.
Lists are made up of nodes, where each node contains a reference to the next node in the list. In addition, each node usually contains a unit of data called the cargo. In our first example, the cargo will be a single integer, but later we will write a generic list that can contain objects of any type.
generic data structure data structure!generic
The Node class
editNode class class!Node
As usual when we write a new class, we'll start with the instance variables, one or two constructors and toString so that we can test the basic mechanism of creating and displaying the new type.
verbatim public class Node
int cargo; Node next;
public Node () cargo = 0; next = null;
public Node (int cargo, Node next) this.cargo = cargo; this.next = next;
public String toString () return cargo + "";
verbatim
The declarations of the instance variables follow naturally from the specification, and the rest follows mechanically from the instance variables. The expression cargo + "" is an awkward but concise way to convert an integer to a String.
To test the implementation so far, we would put something like this in main:
verbatim
Node node = new Node (1, null); System.out.println (node);
verbatim
The result is simply
verbatim 1 verbatim
To make it interesting, we need a list with more than one node!
verbatim
Node node1 = new Node (1, null); Node node2 = new Node (2, null); Node node3 = new Node (3, null);
verbatim
This code creates three nodes, but we don't have a list yet because the nodes are not linked. The state diagram looks like this:
figure=figs/list1.eps
To link up the nodes, we have to make the first node refer to the
second and the second node refer to the third.
verbatim
node1.next = node2; node2.next = node3; node3.next = null;
verbatim
The reference of the third node is null, which indicates that it is the end of the list. Now the state diagram looks like:
figure=figs/list2.eps
Now we know how to create nodes and link them into lists. What
might be less clear at this point is why.
Lists as collections
editcollection
The thing that makes lists useful is that they are a way of assembling multiple objects into a single entity, sometimes called a collection. In the example, the first node of the list serves as a reference to the entire list.
list!printing list!as parameter
If we want to pass the list as a parameter, all we have to pass is a reference to the first node. For example, the method printList takes a single node as an argument. Starting with the head of the list, it prints each node until it gets to the end (indicated by the null reference).
verbatim
public static void printList (Node list) Node node = list;
while (node != null) System.out.print (node); node = node.next; System.out.println ();
verbatim
To invoke this method we just have to pass a reference to the first node:
verbatim
printList (node1);
verbatim
Inside printList we have a reference to the first node of the list, but there is no variable that refers to the other nodes. We have to use the next value from each node to get to the next node.
This diagram shows the value of list and the values that node takes on:
figure=figs/list3.eps
This way of moving through a list is called a traversal,
just like the similar pattern of moving through the elements of
an array. It is common to use a loop variable like node
to refer to each of the nodes in the list in succession.
loop variable list!traversal traverse
The output of this method is
verbatim 123 verbatim
By convention, lists are printed in parentheses with commas between the elements, as in (1, 2, 3). As an exercise, modify printList so that it generates output in this format.
As another exercise, rewrite printList using a for loop instead of a while loop.
Lists and recursion
editlist!traverse recursively traverse
Recursion and lists go together like fava beans and a nice Chianti. For example, here is a recursive algorithm for printing a list backwards:
fava beans Chianti
enumerate
Separate the list into two pieces: the first node (called the head) and the rest (called the tail).
Print the tail backwards.
Print the head.
enumerate
Of course, Step 2, the recursive call, assumes that we have a way of printing a list backwards. But if we assume that the recursive call works---the leap of faith---then we can convince ourselves that this algorithm works.
leap of faith list!printing backwards
All we need is a base case, and a way of proving that for any list we will eventually get to the base case. A natural choice for the base case is a list with a single element, but an even better choice is the empty list, represented by null.
verbatim
public static void printBackward (Node list) if (list == null) return;
Node head = list; Node tail = list.next;
printBackward (tail); System.out.print (head);
verbatim
The first line handles the base case by doing nothing. The next two lines split the list into head and tail. The last two lines print the list.
We invoke this method exactly as we invoked printList:
verbatim
printBackward (node1);
verbatim
The result is a backwards list.
Can we prove that this method will always terminate? In other words, will it always reach the base case? In fact, the answer is no. There are some lists that will make this method crash.
Infinite lists
editinfinite list list!infinite loop!in list list!loop
There is nothing to prevent a node from referring back to an earlier node in the list, including itself. For example, this figure shows a list with two nodes, one of which refers to itself.
figure=figs/list4.eps
If we invoke printList on this list, it will loop forever.
If we invoke printBackward it will recurse infinitely.
This sort of behavior makes infinite lists difficult to work
with.
Nevertheless, they are occasionally useful. For example, we might represent a number as a list of digits and use an infinite list to represent a repeating fraction.
Regardless, it is problematic that we cannot prove that printList and printBackward terminate. The best we can do is the hypothetical statement, ``If the list contains no loops, then these methods will terminate. This sort of claim is called a precondition. It imposes a constraint on one of the parameters and describes the behavior of the method if the constraint is satisfied. We will see more examples soon.
precondition
The fundamental ambiguity theorem
editambiguity!fundamental theorem theorem!fundamental ambiguity
There is a part of printBackward that might have raised an eyebrow:
verbatim
Node head = list; Node tail = list.next;
verbatim
After the first assignment, head and list have the same type and the same value. So why did I create a new variable?
The reason is that the two variables play different roles. We think of head as a reference to a single node, and we think of list as a reference to the first node of a list. These ``roles are not part of the program; they are in the mind of the programmer.
variable!roles role!variable
The second assignment creates a new reference to the second node in the list, but in this case we think of it as a list. So, even though head and tail have the same type, they play different roles.
This ambiguity is useful, but it can make programs with lists difficult to read. I often use variable names like node and list to document how I intend to use a variable, and sometimes I create additional variables to disambiguate.
I could have written printBackward without head and tail, but I think it makes it harder to understand:
verbatim
public static void printBackward (Node list) if (list == null) return;
printBackward (list.next); System.out.print (list);
verbatim
Looking at the two function calls, we have to remember that printBackward treats its argument as a list and print treats its argument as a single object.
Always keep in mind the fundamental ambiguity theorem:
quote A variable that refers to a node might treat the node as a single object or as the first in a list of nodes. quote
Object methods for nodes
editmethod!object node!object method
You might have wondered why printList and printBackward are class methods. I have made the claim that anything that can be done with class methods can also be done with object methods; it's just a question of which form is cleaner.
In this case there is a legitimate reason to choose class methods. It is legal to send null as an argument to a class method, but it is not legal to invoke an object method on a null object.
verbatim Node node = null; printList (node); // legal node.printList (); // NullPointerException verbatim
This limitation makes it awkward to write list-manipulating code in a clean, object-oriented style. A little later we will see a way to get around this, though.
Modifying lists
editlist!modifying modifying lists
Obviously one way to modify a list is to change the cargo of one of the nodes, but the more interesting operations are the ones that add, remove, or reorder the nodes.
As an example, we'll write a method that removes the second node in the list and returns a reference to the removed node.
verbatim
public static Node removeSecond (Node list) Node first = list; Node second = list.next;
// make the first node refer to the third first.next = second.next;
// separate the second node from the rest of the list second.next = null; return second;
verbatim
Again, I am using temporary variables to make the code more readable. Here is how to use this method.
verbatim
printList (node1); Node removed = removeSecond (node1); printList (removed); printList (node1);
verbatim
The output is
verbatim (1, 2, 3) the original list (2) the removed node (1, 3) the modified list verbatim
Here is a state diagram showing the effect of this operation.
figure=figs/list5.eps
What happens if we invoke this method and pass a list with only one
element (a singleton)? What happens if we pass the empty list
as an argument? Is there a precondition for this method?
singleton
Wrappers and helpers
editwrapper methods method!wrapper helper method method!helper
For some list operations it is useful to divide the labor into two methods. For example, to print a list backwards in the conventional list format, (3, 2, 1) we can use the printBackwards method to print 3, 2, but we need a separate method to print the parentheses and the first node. We'll call it printBackwardNicely.
verbatim
public static void printBackwardNicely (Node list) System.out.print ("(");
if (list != null) Node head = list; Node tail = list.next; printBackward (tail); System.out.print (head); System.out.println (")");
verbatim
Again, it is a good idea to check methods like this to see if they work with special cases like an empty list or a singleton.
singleton
Elsewhere in the program, when we use this method, we will invoke printBackwardNicely directly and it will invoke printBackward on our behalf. In that sense, printBackwardNicely acts as a wrapper, and it uses printBackward as a helper.
The LinkedList class
editLinkedList class!LinkedList
There are a number of subtle problems with the way we have been implementing lists. In a reversal of cause and effect, I will propose an alternative implementation first and then explain what problems it solves.
First, we will create a new class called LinkedList. Its instance variables are an integer that contains the length of the list and a reference to the first node in the list. LinkedList objects serve as handles for manipulating lists of Node objects.
verbatim public class LinkedList
int length; Node head;
public LinkedList () length = 0; head = null;
verbatim
One nice thing about the LinkedList class is that it gives us a natural place to put wrapper functions like printBackwardNicely, which we can make an object method in the LinkedList class.
verbatim
public void printBackward () System.out.print ("(");
if (head != null) Node tail = head.next; Node.printBackward (tail); System.out.print (head); System.out.println (")");
verbatim
Just to make things confusing, I renamed printBackwardNicely. Now there are two methods named printBackward: one in the Node class (the helper) and one in the LinkedList class (the wrapper). In order for the wrapper to invoke the helper, it has to identify the class explicitly (Node.printBackward).
So, one of the benefits of the LinkedList class is that it provides a nice place to put wrapper functions. Another is that it makes it easier to add or remove the first element of a list. For example, addFirst is an object method for LinkedLists; it takes an int as an argument and puts it at the beginning of the list.
verbatim
public void addFirst (int i) Node node = new Node (i, head); head = node; length++;
verbatim
As always, to check code like this it is a good idea to think about the special cases. For example, what happens if the list is initially empty?
Invariants
editinvariant object invariant list!well-formed
Some lists are ``well-formed; others are not. For example, if a list contains a loop, it will cause many of our methods to crash, so we might want to require that lists contain no loops. Another requirement is that the length value in the LinkedList object should be equal to the actual number of nodes in the list.
Requirements like this are called invariants because, ideally, they should be true of every object all the time. Specifying invariants for objects is a useful programming practice because it makes it easier to prove the correctness of code, check the integrity of data structures, and detect errors.
One thing that is sometimes confusing about invariants is that there are some times when they are violated. For example, in the middle of addFirst, after we have added the node, but before we have incremented length, the invariant is violated. This kind of violation is acceptable; in fact, it is often impossible to modify an object without violating an invariant for at least a little while. Normally the requirement is that every method that violates an invariant must restore the invariant.
If there is any significant stretch of code in which the invariant is violated, it is important for the comments to make that clear, so that no operations are performed that depend on the invariant.
documentation
Glossary
editlist node cargo link generic data structure precondition invariant wrapper
description
[list:] A data structure that implements a collection using a sequence of linked nodes.
[node:] An element of a list, usually implemented as an object that contains a reference to another object of the same type.
[cargo:] An item of data contained in a node.
[link:] An object reference embedded in an object.
[generic data structure:] A kind of data structure that can contain data of any type.
[precondition:] An assertion that must be true in order for a method to work correctly.
[invariant:] An assertion that should be true of an object at all times (except maybe while the object is being modified).
[wrapper method:] A method that acts as a middle-man between a caller and a helper method, often offering an interface that is cleaner than the helper method's.