C++ Programming/Operators/Arrays
Arrays
editAn array stores a constant-sized sequential set of blocks, each block containing a value of the selected type under a single name. Arrays often help organize collections of data efficiently and intuitively.
It is easiest to think of an array as simply a list with each value as an item of the list. Where individual elements are accessed by their position in the array called its index, also known as subscript. Each item in the array has an index from 0 to (the size of the array) -1, indicating its position in the array.
Advantages of arrays include:
- Random access in O(1) (Big O notation)
- Ease of use/port: Integrated into most modern languages
Disadvantages include:
- Constant size
- Constant data-type
- Large free sequential block to accommodate large arrays
- When used as non-static data members, the element type must allow default construction
- Arrays do not support copy assignment (you cannot write
arraya = arrayb
) - Arrays cannot be used as the value type of a standard container
- Syntax of use differs from standard containers
- Arrays and inheritance don't mix (an array of Derived is not an array of Base, but can too easily be treated like one)
For example, here is an array of integers, called List with 5 elements, numbered 0 to 4. Each element of the array is an integer. Like other integer variables, the elements of the array start out uninitialized. That means it is filled with unknown values until we initialize it by assigning something to it. (Remember primitive types in C are not initialized to 0.)
Index | Data |
00 | unspecified |
01 | unspecified |
02 | unspecified |
03 | unspecified |
04 | unspecified |
Since an array stores values, what type of values and how many values to store must be defined as part of an array declaration, so it can allocate the needed space. The size of array must be a const integral expression greater than zero. That means that you cannot use user input to declare an array. You need to allocate the memory (with operator new[]), so the size of an array has to be known at compile time. Another disadvantage of the sequential storage method is that there has to be a free sequential block large enough to hold the array. If you have an array of 500,000,000 blocks, each 1 byte long, you need to have roughly 500 megabytes of sequential space to be free; Sometimes this will require a defragmentation of the memory, which takes a long time.
To declare an array you can do:
int numbers[30]; // creates an array of 30 integers
or
char letters[4]; // create an array of 4 characters
and so on...
to initialize as you declare them you can use:
int vector[6]={0,0,1,0,0,0};
this will not only create the array with 6 int elements but also initialize them to the given values.
If you initialize the array with less than the full number of elements, the remaining elements are set to a default value - zero in the case of numbers.
int vector[6]={0,0,1}; // this is the same as the example above
If you fully initialize the array as you declare it, you can allow the compiler to work out the size of the array:
int vector[]={0,0,1,0,0,0}; // the compiler can see that there are 6 elements
Assigning and accessing data
editYou can assign data to the array by using the name of the array, followed by the index.
For example to assign the number 200 into the element at index 2 in the array
List[2] = 200;
will give
Index | Data |
00 | unspecified |
01 | unspecified |
02 | 200 |
03 | unspecified |
04 | unspecified |
You can access the data at an element of the array the same way.
std::cout << List[2] << std::endl;
This will print 200.
Basically working with individual elements in an array is no different then working with normal variables.
As you see accessing a value stored in an array is easy. Take this other example:
int x;
x = vector[2];
The above declaration will assign x the valued store at index 2 of variable vector which is 1.
Arrays are indexed starting at 0, as opposed to starting at 1. The first element of the array above is vector[0]. The index to the last value in the array is the array size minus one. In the example above the subscripts run from 0 through 5. C++ does not do bounds checking on array accesses. The compiler will not complain about the following:
char y;
int z = 9;
char vector[6] = { 1, 2, 3, 4, 5, 6 };
// examples of accessing outside the array. A compile error is not raised
y = vector[15];
y = vector[-4];
y = vector[z];
During program execution, an out of bounds array access does not always cause a run time error. Your program may happily continue after retrieving a value from vector[-1]. To alleviate indexing problems, the sizeof
expression is commonly used when coding loops that process arrays.
int ix;
short anArray[]= { 3, 6, 9, 12, 15 };
for (ix=0; ix< (sizeof(anArray)/sizeof(short)); ++ix) {
DoSomethingWith( anArray[ix] );
}
Notice in the above example, the size of the array was not explicitly specified. The compiler knows to size it at 5 because of the five values in the initializer list. Adding an additional value to the list will cause it to be sized to six, and because of the sizeof
expression in the for
loop, the code automatically adjusts to this change.
multidimensional arrays
editYou can also use multi-dimensional arrays. The simplest type is a two dimensional array. This creates a rectangular array - each row has the same number of columns. To get a char array with 3 rows and 5 columns we write...
char two_d[3][5];
To access/modify a value in this array we need two subscripts:
char ch;
ch = two_d[2][4];
or
two_d[0][0] = 'x';
example
editThere are also weird notations possible:
int a[100];
int i = 0;
if (a[i]==i[a])
printf("Hello World!\n");
a[i] and i[a] point to the same location. You will understand this better after knowing about pointers.
To get an array of a different size, you must explicitly deal with memory using realloc, malloc, memcpy, etc.
Why start at 0?
editMost programming languages number arrays from 0. This is useful in languages where arrays are used interchangeably with a pointer to the first element of the array. In C++ the address of an element in the array can be computed from (address of first element) + i, where i is the index starting at 0 (a[1] == *(a + 1)). Notice here that "(address of the first element) + i" is not a literal addition of numbers. Different types of data have different sizes and the compiler will correctly take this into account. Therefore, it is simpler for the pointer arithmetic if the index started at 0.
Why no bounds checking on array indexes?
editC++ does allow for, but doesn't force, bounds-checking implementations, in practice little or no checking is done. It affects storage requirements (needing "fat pointers") and impacts runtime performance. However, the std::vector template class, that we mentioned and we will examine later in greater detail (a template class container, representing an array provides the at() method) which does enforce bounds checking. Also in many implementations, the standard containers include particularly complete bounds checking in debug mode. They might not support these checks in release builds, as any performance reduction in container classes relative to built-in arrays might prevent programmers from migrating from arrays to the more modern, safer container classes.