The Way of the Java/Arrays

Arrays

edit

arrays

array
type!array

An array is a set of values where each value is identified by an index. You can make an array of ints, doubles, or any other type, but all the values in an array have to have the same type.

Syntactically, array types look like other Java types except they are followed by []. For example, int[] is the type ``array of integers and double[] is the type ``array of doubles.

You can declare variables with these types in the usual ways:

verbatim

   int[] count;
   double[] values;

verbatim

Until you initialize these variables, they are set to null. To create the array itself, use the new command.

verbatim

   count = new int[4];
   values = new double[size];

verbatim

The first assignment makes count refer to an array of 4 integers; the second makes values refer to an array of doubles. The number of elements in values depends on size. You can use any integer expression as an array size.

null
state diagram

The following figure shows how arrays are represented in state diagrams:


figure=figs/array.eps


The large numbers inside the boxes are the elements of the array. The small numbers outside the boxes are the indices used to identify each box. When you allocate a new array, the elements are initialized to zero.

Accessing elements

edit
element
array!element

To store values in the array, use the [] operator. For example count[0] refers to the ``zeroeth element of the array, and count[1] refers to the ``oneth element. You can use the [] operator anywhere in an expression:

verbatim

   count[0] = 7;
   count[1] = count[0] * 2;
   count[2]++;
   count[3] -= 60;

verbatim

All of these are legal assignment statements. Here is the effect of this code fragment:


figure=figs/array2.eps


By now you should have noticed that the four elements of this array are numbered from 0 to 3, which means that there is no element with the index 4. This should sound familiar, since we saw the same thing with String indices. Nevertheless, it is a common error to go beyond the bounds of an array, which will cause an ArrayOutOfBoundsException. As with all exceptions, you get an error message and the program quits.

exception!ArrayOutOfBounds
run-time error
index
expression

You can use any expression as an index, as long as it has type int. One of the most common ways to index an array is with a loop variable. For example:

verbatim

   int i = 0;
   while (i < 4) 
     System.out.println (count[i]);
     i++;
   

verbatim

This is a standard while loop that counts from 0 up to 4, and when the loop variable i is 4, the condition fails and the loop terminates. Thus, the body of the loop is only executed when i is 0, 1, 2 and 3.

loop
loop variable
variable!loop

Each time through the loop we use i as an index into the array, printing the ith element. This type of array traversal is very common. Arrays and loops go together like fava beans and a nice Chianti.

fava beans
Chianti

Copying arrays

edit
array!copying

When you copy an array variable, remember that you are copying a reference to the array. For example:

verbatim

   double[] a = new double [3];
   double[] b = a;

verbatim

This code creates one array of three doubles, and sets two different variables to refer to it. This situation is a form of aliasing.


figure=figs/array3.eps


Any changes in either array will be reflected in the other. This is not usually the behavior you want; instead, you should make a copy of the array, by allocating a new array and copying each element from one to the other.

verbatim

   double[] b = new double [3];
   int i = 0;
   while (i < 4) 
     b[i] = a[i];
     i++;
   

verbatim

for loops

edit

The loops we have written so far have a number of elements in common. All of them start by initializing a variable; they have a test, or condition, that depends on that variable; and inside the loop they do something to that variable, like increment it.

loop!for
for
statement!for

This type of loop is so common that there is an alternate loop statement, called for, that expresses it more concisely. The general syntax looks like this:

verbatim

   for (INITIALIZER; CONDITION; INCREMENTOR) 
     BODY
   

verbatim

This statement is exactly equivalent to

verbatim

   INITIALIZER;
   while (CONDITION) 
     BODY
     INCREMENTOR
   

verbatim

except that it is more concise and, since it puts all the loop-related statements in one place, it is easier to read. For example:

verbatim

   for (int i = 0; i < 4; i++) 
     System.out.println (count[i]);
   

verbatim

is equivalent to

verbatim

   int i = 0;
   while (i < 4) 
     System.out.println (count[i]);
     i++;
   

verbatim

As an exercise, write a for loop to copy the elements of an array.

Arrays and objects

edit
object!compared to array
array!compared to object

In many ways, arrays behave like objects:

itemize

When you declare an array variable, you get a reference to an array.

You have to use the new command to create the array itself.

When you pass an array as an argument, you pass a reference, which means that the invoked method can change the contents of the array.

itemize

Some of the objects we have looked at, like Rectangles, are similar to arrays, in the sense that they are named collection of values. This raises the question, ``How is an array of 4 integers different from a Rectangle object?

collection

If you go back to the definition of ``array at the beginning of the chapter, you will see one difference, which is that the elements of an array are identified by indices, whereas the elements (instance variables) of an object have names (like x, width, etc.).

Another difference between arrays and objects is that all the elements of an array have to be the same type. Although that is also true of Rectangles, we have seen other objects that have instance variables with different types (like Time).

Array length

edit
length!array
array!length

Actually, arrays do have one named instance variable: length. Not surprisingly, it contains the length of the array (number of elements). It is a good idea to use this value as the upper bound of a loop, rather than a constant value. That way, if the size of the array changes, you won't have to go through the program changing all the loops; they will work correctly for any size array.

verbatim

   for (int i = 0; i < a.length; i++) 
     b[i] = a[i];
   

verbatim

The last time the body of the loop gets executed, i is a.length - 1, which is the index of the last element. When i is equal to a.length, the condition fails and the body is not executed, which is a good thing, since it would cause an exception. This code assumes that the array b contains at least as many elements as a.

As an exercise, write a method called cloneArray that takes an array of integers as a parameter, creates a new array that is the same size, copies the elements from the first array into the new one, and then returns a reference to the new array.

Random numbers

edit

random pseudorandom

random number
deterministic
nondeterministic

Most computer programs do the same thing every time they are executed, so they are said to be deterministic. Usually, determinism is a good thing, since we expect the same calculation to yield the same result. For some applications, though, we would like the computer to be unpredictable. Games are an obvious example, but there are many more.

Making a program truly nondeterministic turns out to be not so easy, but there are ways to make it at least seem nondeterministic. One of them is to generate random numbers and use them to determine the outcome of the program. Java provides a built-in method that generates pseudorandom numbers, which are not truly random in the mathematical sense, but for our purposes, they will do.

Check out the documentation of the random method in the Math class. The return value is a double between 0.0 and 1.0. Each time you invoke random you get a different randomly-generated number. To see a sample, run this loop:

verbatim

   for (int i = 0; i < 10; i++) 
     double x = Math.random ();
     System.out.println (x);
   

verbatim

To generate a random double between 0.0 and an upper bound like high, you can multiply x by high. How would you generate a random number between low and high? How would you generate a random integer?

Statistics

edit
statistics
distribution
mean

The numbers generated by random are supposed to be distributed uniformly. If you have taken statistics, you know what that means. Among other things, it means that if we divide the range of possible values into equal sized ``buckets, and count the number of times a random value falls in each bucket, each bucket should get the same number of hits (eventually).

In the next few sections, we will write programs that generate a sequence of random numbers and check whether this property holds true.

Array of random numbers

edit

The first step is to generate a large number of random values and store them in an array. By ``large number, of course, I mean 8. It's always a good idea to start with a manageable number, to help with debugging, and then increase it later.

The following method takes a single argument, the size of the array. It allocates a new array of doubles, fills it with random values, and returns a reference to the new array.

verbatim

 public static double[] randomArray (int n) 
   double[] a = new double[n];
   for (int i = 0; i<a.length; i++) 
     a[i] = Math.random ();
   
   return a;
 

verbatim

The return type is double[], which means that this method returns an array of doubles. To test this method, it is convenient to have a method that prints the contents of an array.

verbatim

 public static void printArray (double[] a) 
   for (int i = 0; i<a.length; i++) 
     System.out.println (a[i]);
   
 

verbatim

The following code generates an array and prints it:

verbatim

   int numValues = 8;
   double[] array = randomArray (numValues);
   printArray (array);

verbatim

On my machine the output is

verbatim 0.7344558779885422 0.6224282219647016 0.09591424515329172 0.2992298398883563 0.7736458103088713 0.7069110192991597 0.7042440765950522 0.977839532249852 verbatim

which is pretty random-looking. Your results may differ.

If these numbers are really random, we expect half of them to be greater than 0.5 and half to be less. In fact, six are greater than 0.5, so that's a little high.

If we divide the range into four buckets---from 0.0 to 0.25, 0.25 to 0.5, 0.5 to 0.75, and 0.75 to 1.0---we expect 2 values to fall in each bucket. In fact, we get 1, 1, 4, 2. Again, not exactly what we expected.

Do these results mean the values are not really random? It's hard to tell. With so few values, the chances are slim that we would get exactly what we expect. But as the number of values increases, the outcome should be more predictable.

To test this theory, we'll write some programs that divide the range into buckets and count the number of values in each.

Counting

edit
traverse!array
array!traverse
looping and counting
counter

A good approach to problems like this is to think of simple methods that are easy to write, and that might turn out to be useful. Then you can combine them into a solution. Of course, it is not easy to know ahead of time which methods are likely to be useful, but as you gain experience you will have a better idea.

Also, it is not always obvious what sort of things are easy to write, but a good approach is to look for subproblems that fit a pattern you have seen before.

Back in Section loopcount we looked at a loop that traversed a string and counted the number of times a given letter appeared. You can think of this program as an example of a pattern called ``traverse and count. The elements of this pattern are:

itemize

A set or container that can be traversed, like an array or a string.

A test that you can apply to each element in the container.

A counter that keeps track of how many elements pass the test.

itemize

In this case, I have a method in mind called inBucket that counts the number of elements in an array that fall in a given bucket. The parameters are the array and two doubles that specify the lower and upper bounds of the bucket.

verbatim

 public static int inBucket (double[] a, double low, double high) 
   int count = 0;
   for (int i=0; i<a.length; i++) 
     if (a[i] >= low && a[i] < high) count++;
   
   return count;
 

verbatim

I haven't been very careful about whether something equal to low or high falls in the bucket, but you can see from the code that low is in and high is out. That should prevent me from counting any elements twice.

Now, to divide the range into two pieces, we could write

verbatim

   int low = inBucket (a, 0.0, 0.5);
   int high = inBucket (a, 0.5, 1.0);

verbatim

To divide it into four pieces:

verbatim

   int bucket1 = inBucket (a, 0.0, 0.25);
   int bucket2 = inBucket (a, 0.25, 0.5);
   int bucket3 = inBucket (a, 0.5, 0.75);
   int bucket4 = inBucket (a, 0.75, 1.0);

verbatim

You might want to try out this program using a larger numValues. As numValues increases, are the numbers in each bucket levelling off?

Many buckets

edit
bucket
histogram

Of course, as the number of buckets increases, we don't want to have to rewrite the program, especially since the code is getting big and repetitive. Any time you find yourself doing something more than a few times, you should be looking for a way to automate it.

Let's say that we wanted 8 buckets. The width of each bucket would be one eighth of the range, which is 0.125. To count the number of values in each bucket, we need to be able to generate the bounds of each bucket automatically, and we need to have some place to store the 8 counts.

We can solve the first problem with a loop:

verbatim

   int numBuckets = 8;
   double bucketWidth = 1.0 / numBuckets;
   for (int i = 0; i < numBuckets; i++) 
     double low = i * bucketWidth;
     double high = low + bucketWidth;
     System.out.println (low + " to " + high);
   

verbatim

This code uses the loop variable i to multiply by the bucket width, in order to find the low end of each bucket. The output of this loop is:

verbatim 0.0 to 0.125 0.125 to 0.25 0.25 to 0.375 0.375 to 0.5 0.5 to 0.625 0.625 to 0.75 0.75 to 0.875 0.875 to 1.0 verbatim

You can confirm that each bucket is the same width, that they don't overlap, and that they cover the whole range from 0.0 to 1.0.

Now we just need a way to store 8 integers, preferably so we can use an index to access each one. Immediately, you should be thinking ``array!

What we want is an array of 8 integers, which we can allocate outside the loop; then, inside the loop, we'll invoke inBucket and store the result:

verbatim

   int numBuckets = 8;
   int[] buckets = new int [8];
   double bucketWidth = 1.0 / numBuckets;
   for (int i = 0; i<numBuckets; i++) 
     double low = i * bucketWidth;
     double high = low + bucketWidth;
     //System.out.println (low + " to " + high);
     buckets[i] = inBucket (a, low, high);
   

verbatim

The tricky thing here is that I am using the loop variable as an index into the buckets array, in addition to using it to compute the range of each bucket.

This code works. I cranked the number of values up to 1000 and divided the range into 8 buckets. The output is:

verbatim 129 109 142 118 131 124 121 126 verbatim

which is pretty close to 125 in each bucket. At least, it's close enough that I can believe the random number generator is working.

A single-pass solution

edit

Although this code works, it is not as efficient as it could be. Every time it invokes inBucket, it traverses the entire array. As the number of buckets increases, that gets to be a lot of traversals.

It would be better to make a single pass through the array, and for each value, compute which bucket it falls in. Then we could increment the appropriate counter.

In the previous section, we took an index, i, and multiplied it by the bucketWidth in order to find the lower bound of a given bucket. Now we want to take a value in the range 0.0 to 1.0, and find the index of the bucket where it falls.

Since this problem is the inverse of the previous problem we might guess that we should divide by the bucketwidth instead of multiplying. That guess is correct.

Remember that since bucketWidth = 1.0 / numBuckets, dividing by bucketWidth is the same as multiplying by numBuckets. If we take a number in the range 0.0 to 1.0 and multiply by numBuckets, we get a number in the range from 0.0 to numBuckets. If we round that number to the next lower integer, we get exactly what we are looking for---the index of the appropriate bucket.

verbatim

   int numBuckets = 8;
   int[] buckets = new int [8];
   for (int i = 0; i < numValues; i++) 
     int index = (int) (a[i] * numBuckets);
     buckets[index]++;
   

verbatim

Here I am using a typecast to round the value down to the next integer and convert it to type int at the same time.

Is it possible for this calculation to produce an index that is out of range (either negative or greater than a.length-1)? If so, how would you fix it?

histogram

An array like buckets, that contains counts of the number of values in each range, is called a histogram. As an exercise, write a method called histogram that takes an array and a number of buckets as parameters, and that returns a histogram with the given number of buckets.

Glossary

edit

description

[array:] A named collection of values, where all the values have the same type, and each value is identified by an index.

[collection:] Any data structure that contains a set of items or elements.

[element:] One of the values in an array. The [] operator selects elements of an array.

[index:] An integer variable or value used to indicate an element of an array.

[deterministic:] A program that does the same thing every time it is invoked.

[pseudorandom:] A sequence of numbers that appear to be random, but which are actually the product of a deterministic computation.

[histogram:] An array of integers where each integer counts the number of values that fall into a certain range.

array
collection
element
index
deterministic
pseudorandom

description