C++ Programming/Operators/Arrays

Arrays
edit

An array stores a constant-sized sequential set of blocks, each block containing a value of the selected type under a single name. Arrays often help organize collections of data efficiently and intuitively.

It is easiest to think of an array as simply a list with each value as an item of the list. Where individual elements are accessed by their position in the array called its index, also known as subscript. Each item in the array has an index from 0 to (the size of the array) -1, indicating its position in the array.

Advantages of arrays include:

  • Random access in O(1) (Big O notation)
  • Ease of use/port: Integrated into most modern languages

Disadvantages include:

  • Constant size
  • Constant data-type
  • Large free sequential block to accommodate large arrays
  • When used as non-static data members, the element type must allow default construction
  • Arrays do not support copy assignment (you cannot write arraya = arrayb)
  • Arrays cannot be used as the value type of a standard container
  • Syntax of use differs from standard containers
  • Arrays and inheritance don't mix (an array of Derived is not an array of Base, but can too easily be treated like one)

Note:
If complexity allows you should consider the use of containers (as in the C++ Standard Library). You should and can use for example std::vector which are as fast as arrays in most situations, can be dynamically resized, support iterators, and lets you treat the storage of the vector just like an array.

In C++11, std::array provides a fixed size array which is guaranteed to be as efficient as an old-style array but with some advantages such as being able to be queried for its size, using iterators like other containers, and having a copy assignment operator.

(Modern C allows VLAs, variable length arrays, but these are not used in C++, which already had a facility for re-sizable arrays in std::vector.)

The pointer operator as you will see is similar to the array operator.


For example, here is an array of integers, called List with 5 elements, numbered 0 to 4. Each element of the array is an integer. Like other integer variables, the elements of the array start out uninitialized. That means it is filled with unknown values until we initialize it by assigning something to it. (Remember primitive types in C are not initialized to 0.)

Index Data
00 unspecified
01 unspecified
02 unspecified
03 unspecified
04 unspecified

Since an array stores values, what type of values and how many values to store must be defined as part of an array declaration, so it can allocate the needed space. The size of array must be a const integral expression greater than zero. That means that you cannot use user input to declare an array. You need to allocate the memory (with operator new[]), so the size of an array has to be known at compile time. Another disadvantage of the sequential storage method is that there has to be a free sequential block large enough to hold the array. If you have an array of 500,000,000 blocks, each 1 byte long, you need to have roughly 500 megabytes of sequential space to be free; Sometimes this will require a defragmentation of the memory, which takes a long time.

To declare an array you can do:

int numbers[30]; // creates an array of 30 integers

or

char letters[4]; // create an array of 4 characters

and so on...

to initialize as you declare them you can use:

int vector[6]={0,0,1,0,0,0};

this will not only create the array with 6 int elements but also initialize them to the given values.

If you initialize the array with less than the full number of elements, the remaining elements are set to a default value - zero in the case of numbers.

int vector[6]={0,0,1}; // this is the same as the example above

If you fully initialize the array as you declare it, you can allow the compiler to work out the size of the array:

int vector[]={0,0,1,0,0,0};  // the compiler can see that there are 6 elements
Assigning and accessing data
edit

You can assign data to the array by using the name of the array, followed by the index.

For example to assign the number 200 into the element at index 2 in the array

 
List[2] = 200;

will give

Index Data
00 unspecified
01 unspecified
02 200
03 unspecified
04 unspecified

You can access the data at an element of the array the same way.

std::cout << List[2] << std::endl;

This will print 200.

Basically working with individual elements in an array is no different then working with normal variables.

As you see accessing a value stored in an array is easy. Take this other example:

int x;
x = vector[2];

The above declaration will assign x the valued store at index 2 of variable vector which is 1.

Arrays are indexed starting at 0, as opposed to starting at 1. The first element of the array above is vector[0]. The index to the last value in the array is the array size minus one. In the example above the subscripts run from 0 through 5. C++ does not do bounds checking on array accesses. The compiler will not complain about the following:

char y;
int z = 9;
char vector[6] = { 1, 2, 3, 4, 5, 6 };
  
// examples of accessing outside the array. A compile error is not raised
y = vector[15];
y = vector[-4];
y = vector[z];

During program execution, an out of bounds array access does not always cause a run time error. Your program may happily continue after retrieving a value from vector[-1]. To alleviate indexing problems, the sizeof expression is commonly used when coding loops that process arrays.

int ix;
short anArray[]= { 3, 6, 9, 12, 15 };
  
for (ix=0; ix< (sizeof(anArray)/sizeof(short)); ++ix) {
  DoSomethingWith( anArray[ix] );
}

Notice in the above example, the size of the array was not explicitly specified. The compiler knows to size it at 5 because of the five values in the initializer list. Adding an additional value to the list will cause it to be sized to six, and because of the sizeof expression in the for loop, the code automatically adjusts to this change.

multidimensional arrays
edit

You can also use multi-dimensional arrays. The simplest type is a two dimensional array. This creates a rectangular array - each row has the same number of columns. To get a char array with 3 rows and 5 columns we write...

char two_d[3][5];

To access/modify a value in this array we need two subscripts:

char ch;
ch = two_d[2][4];

or

two_d[0][0] = 'x';
example
edit

There are also weird notations possible:

int a[100];
int i = 0;
if (a[i]==i[a])
  printf("Hello World!\n");

a[i] and i[a] point to the same location. You will understand this better after knowing about pointers.

To get an array of a different size, you must explicitly deal with memory using realloc, malloc, memcpy, etc.

Why start at 0?
edit

Most programming languages number arrays from 0. This is useful in languages where arrays are used interchangeably with a pointer to the first element of the array. In C++ the address of an element in the array can be computed from (address of first element) + i, where i is the index starting at 0 (a[1] == *(a + 1)). Notice here that "(address of the first element) + i" is not a literal addition of numbers. Different types of data have different sizes and the compiler will correctly take this into account. Therefore, it is simpler for the pointer arithmetic if the index started at 0.

Why no bounds checking on array indexes?
edit

C++ does allow for, but doesn't force, bounds-checking implementations, in practice little or no checking is done. It affects storage requirements (needing "fat pointers") and impacts runtime performance. However, the std::vector template class, that we mentioned and we will examine later in greater detail (a template class container, representing an array provides the at() method) which does enforce bounds checking. Also in many implementations, the standard containers include particularly complete bounds checking in debug mode. They might not support these checks in release builds, as any performance reduction in container classes relative to built-in arrays might prevent programmers from migrating from arrays to the more modern, safer container classes.

Note:
Some compilers, or external tools, can help detect issues outside of the language specifications, even in an automated faction. See the section about debugging for more information.