Polymorphic Data Structures in C/Pointers

Pointers are one of the most essential constructs in C. A pointer is a variable that stores the memory address of another object, such as a variable, rather than a value of its own. Pointers allow a higher level of data manipulation: because C passes arguments by value, a function cannot modify a local variable that belongs to another function unless it is given that variable's address.

Simple Pointer Operations

Developers can explicitly declare pointers using the pointer operator (*):

int *p_int ;

The convention used in this book is to prefix the name of a pointer variable with "p_" to designate its status as a pointer. This is helpful when a function contains several variables of the same apparent type, some of which are pointers and some of which are not. To access the value that a pointer points to, a programmer uses the dereference operator, which is also written with an asterisk:

printf( "%d" , *p_int ) ;

If a variable is not a pointer but needs to be modified inside a called function, the address-of operator (&) can be used to create an "on-the-fly" pointer that is passed to the function. Many programmers know that scanf() needs the address-of operator in front of the variables it reads into, but not all of them know why. C passes arguments by value, so without the operator scanf() would receive only a copy of the variable's current value; it needs the address in order to store the input in the caller's variable.

Pointers are usually used for keeping track of dynamically allocated memory. In the demonstration below the pointer is redundant, but it illustrates the principles of pointers.

#include <stdio.h>

int main( int argc , char *argv[] ) {

   int i ;
   int *p_i = &i ;

   i = 5 ;
   printf( "%d " , i ) ;

   *p_i = 7 ;
   printf( "%d\n" , *p_i ) ;

   return 0 ;

}

First, an integer i is created as a local variable of main(). Then, the pointer p_i is created and its value is set to the address of i. Next, i is assigned the value of 5 and printed. Finally, the integer that p_i points to (which is i itself) is assigned the value of 7 and printed, so the program prints 5 7.

Notice the use of the dereference and address-of operators. The proper use of the pointer, dereference and address-of operators is key to understanding polymorphism.

Dynamic Memory Allocation

Perhaps the most important use of pointers is to keep track of memory that has been allocated dynamically (at run-time). Memory is allocated dynamically using the malloc() function. malloc() takes one argument (the amount of space to be allocated, in bytes) and returns a pointer to void (void *), or NULL if the request cannot be satisfied. C converts a void pointer to any other object pointer type implicitly, although the result is often cast explicitly, as it is in this book. The challenge with dynamic memory allocation is that type sizes can be machine-dependent, meaning an integer on one computer may not be the same size in memory as an integer on another. To overcome this, C provides the sizeof operator, which takes a type name in parentheses (or an expression) and yields the size, in bytes, of that type as a value of the unsigned integer type size_t. For example, in order to dynamically allocate space in memory for one character, a developer would write:

char *p_char ;
p_char = (char *)malloc( sizeof ( char ) ) ;

Dynamic memory allocation is most useful when the amount of space to be reserved is not known at the time of compilation. The following program asks the user how many integers they would like to store, reads them in, and prints them back.

#include <stdio.h>
#include <stdlib.h>

int main( int argc , char *argv[] ) {

   int i , amount , *p_int ;

   printf( "How many integers would you like to store? " ) ;
   if ( scanf( "%d" , &amount ) != 1 || amount < 1 ) return 1 ;

   p_int = (int *)malloc( sizeof ( int ) * amount ) ;
   if ( p_int == NULL ) return 1 ;   /* allocation failed */

   printf( "Please enter %d integers, separated with a space, to store. " , amount ) ;
   /* scanf() needs the address of each element, hence &p_int[i] */
   for ( i = 0 ; i < amount ; i++ ) scanf( "%d" , &p_int[i] ) ;

   for ( i = 0 ; i < amount ; i++ ) printf( "%d " , p_int[i] ) ;
   printf( "\n" ) ;

   free( p_int ) ;   /* release the memory once it is no longer needed */

   return 0 ;

}

Notice that stdlib.h is included at the top of the program. malloc() is declared in stdlib.h, so it has to be included along with stdio.h in programs that use it.

Notice also how the space reserved by malloc() is referred to using the standard syntax for arrays. This is because malloc() reserves space sequentially (all in a row), so pointer arithmetic (discussed below) can be applied to it. In other words, when malloc() reserves space for more than one of a given type, it allocates space for an array of that type.

Pointer Operations on Structures

In addition to the standard member operator (.), C includes a second member operator, ->. It performs two operations at once: it dereferences the pointer to the left of the operator and then accesses the member named to the right. This is helpful when a structure is referred to through a pointer. Using the employee data structure from the previous chapter:

struct employee_data *p_search ;
p_search = (struct employee_data *)malloc( sizeof ( struct employee_data ) ) ;

strcpy(p_search -> name, "Ryan");
strcpy((*p_search).name, "Ryan");

The last two statements perform exactly the same operation. Because we will be working with pointers to structures for the majority of this book, the dereference-member operator will be used more often. Note that the members of a nested structure (sub-members) must still be referred to using the normal member operator, unless the nested member is itself a pointer.

Using Pointers to Return Values

A function in C can return only one value directly. However, sometimes multiple variables need to be changed. To do that, the caller passes pointers to those variables to the called function, which modifies them through the pointers. For more on this topic, please refer to Appendix 1, Function Invocation. The information there is extremely valuable; make sure to go through it before moving on. The remainder of this book assumes that the reader has read and understands the contents of Appendix 1.

Pointer Arithmetic

Pointer arithmetic is simple in concept but somewhat difficult in practice. Pointer addition on arrays is done implicitly by the subscript operator []. The comparison in the following code evaluates to true (1):

int A[4] = { 0 , 1 , 2 , 3 } ;
( A[2] == *(A + 2) ) ;

Adding to a pointer moves the address it refers to forward by the specified number of elements, that is, by that number multiplied by the size of the pointed-to type in bytes. So, &A[2] is the location of A[0] plus the size of two integers in memory (thus giving the third item in the array, since array indexing starts at 0). Pointer arithmetic works the same way on pointers to unions and structures, but it is used less often there because members are normally reached through the member operators.

Pointers to Functions

A very powerful feature of the C language is the ability to take the address of a function and store it as a pointer. In this manner, one could theoretically write a function, then have that function call a different function that is known only at runtime (decreasing the amount of conditional programming needed). Every C programmer is familiar with the standard function declaration syntax:

return_type function( parameter_1 , parameter_2 ... ) ;

Like variables, functions are typed. Unless it is declared void, a function returns data, and that data must have a type; together with the types of the parameters, this makes up the type of the function. Because functions have types, a programmer can declare pointers to them:

return_type (*p_function)( parameter_1 , parameter_2 ... ) ;

A single call through such a pointer can execute any of several functions, as demonstrated below in this text manipulation program.

#include <stdio.h>

void print_reverse( char *string ) ;
void print_normal( char *string ) ;
int main( int argc , char *argv[] ) {

   char input_string[32] ;
   int state , i ;
   void (*p_function)( char *string ) ;

   for ( i = 0 ; i < 32 ; i++ )
      input_string[i] = '\0' ;

   printf( "Please enter a short line of text: " ) ;
   scanf( "%31s" , input_string ) ;   /* at most 31 characters plus the terminating '\0' */

   printf( "Would you like to print this string in reverse? 1 for yes, 0 for no. " ) ;
   scanf( "%d" , &state ) ;

   if ( state == 1 )
      p_function = print_reverse ;
   else if ( state == 0 )
      p_function = print_normal ;
   else {
      printf( "Error: Must enter 0 or 1. Exiting.\n" ) ;
      return 1 ;
   }

   p_function( input_string ) ;

   return 0 ;

}
void print_reverse( char *string ) {

   int i ;

   for ( i = 31 ; i >= 0 ; i-- ) {
      if ( string[i] != '\0' )
         putchar( string[i] ) ;
   }

   printf( "\n" ) ;

}
void print_normal( char *string ) {

   int i = 0 ;

   while ( ( i < 32 ) && ( string[i] != '\0' ) ) {
      putchar( string[i] ) ;
      i++ ;
   }

   printf( "\n" ) ;

}

In this program, there are two functions, print_reverse() and print_normal(). print_normal() simply prints the string, and print_reverse() prints the string's contents in reverse order. But the actual function call inside main() is the same in both cases:

p_function( input_string ) ;

This is made possible by C's ability to call a function through a pointer to it and to pass parameters through that call just as in a direct call.