C++ Programming/Code/Statements/Functions

Functions

edit

A function, which can also be referred to as subroutine, procedure, subprogram or even method, carries out tasks defined by a sequence of statements called a statement block that need only be written once and called by a program as many times as needed to carry out the same task.

Functions may depend on variables passed to them, called arguments, and may pass results of a task on to the caller of the function, this is called the return value.

It is important to note that a function that exists in the global scope can also be called global function and a function that is defined inside a class is called a member function. (The term method is commonly used in other programming languages to refer to things like member functions, but this can lead to confusion in dealing with C++ which supports both virtual and non-virtual dispatch of member functions.)

Note:
When talking or reading about programming, you must consider the language background and the topic of the source. It is very rare to see a C++ programmer use the words procedure or subprogram, this will vary from language to language. In many programming languages the word function is reserved for subroutines that return a value, this is not the case with C++.

Declarations

edit

A function must be declared before being used, with a name to identify it, what type of value the function returns and the types of any arguments that are to be passed to it. Parameters must be named and declare what type of value it takes. Parameters should always be passed as const if their arguments are not modified. Usually functions performs actions, so the name should make clear what it does. By using verbs in function names and following other naming conventions programs can be read more naturally.

The next example we define a function named main that returns an integer value int and takes no parameters. The content of the function is called the body of the function. The word int is a keyword. C++ keywords are reserved words, i.e., cannot be used for any purpose other than what they are meant for. On the other hand main is not a keyword and you can use it in many places where a keyword cannot be used (though that is not recommended, as confusion could result).

int main()
{
  // code
  return 0;
}
 

To do:
Merge and spread the info


The inline keyword declares an inline function, the declaration is a (non-binding) request to the compiler that a particular function be subjected to in-line expansion; that is, it suggests that the compiler insert the complete body of the function in every context where that function is used and so it is used to avoid the overhead implied by making a CPU jump from one place in code to another and back again to execute a subroutine, as is done in naive implementations of subroutines.

inline swap( int& a, int& b) { int const tmp(b); b=a; a=tmp; }

When a function definition is included in a class/struct definition, it will be an implicit inline, the compiler will try to automatically inline that function. No inline keyword is necessary in this case; it is legal, but redundant, to add the inline keyword in that context, and good style is to omit it.

Example:

struct length
{
  explicit length(int metres) : m_metres(metres) {}
  operator int&() { return m_metres; }
  private:
  int m_metres;
};

Inlining can be an optimization, or a pessimization. It can increase code size (by duplicating the code for a function at multiple call sites) or can decrease it (if the code for the function, after optimization, is less than the size of the code needed to call a non-inlined function). It can increase speed (by allowing for more optimization and by avoiding jumps) or can decrease speed (by increasing code size and hence cache misses).

One important side-effect of inlining is that more code is then accessible to the optimizer.

Marking a function as inline also has an effect on linking: multiple definitions of an inline function are permitted (so long as each is in a different translation unit) so long as they are identical. This allows inline function definitions to appear in header files; defining non-inlined functions in header files is almost always an error (though function templates can also be defined in header files, and often are).

Mainstream C++ compilers like Microsoft Visual C++ and GCC support an option that lets the compilers automatically inline any suitable function, even those that are not marked as inline functions. A compiler is often in a better position than a human to decide whether a particular function should be inlined; in particular, the compiler may not be willing or able to inline many functions that the human asks it to.

Excessive use of inlined functions can greatly increase coupling/dependencies and compilation time, as well as making header files less useful as documentation of interfaces.


Normally when calling a function, a program will evaluate and store the arguments, and then call (or branch to) the function's code, and then the function will later return to the caller. While function calls are fast (typically taking much less than a microsecond on modern processors), the overhead can sometimes be significant, particularly if the function is simple and is called many times.

One approach which can be a performance optimization in some situations is to use so-called inline functions. Marking a function as inline is a request (sometimes called a hint) to the compiler to consider replacing a call to the function by a copy of the code of that function.

The result is in some ways similar to the use of the #define macro, but as mentioned before, macros can lead to problems since they are not evaluated by the preprocessor. inline functions do not suffer from the same problems.

If the inlined function is large, this replacement process (known for obvious reasons as "inlining") can lead to "code bloat", leading to bigger (and hence usually slower) code. However, for small functions it can even reduce code size, particularly once a compiler's optimizer runs.

Note that the inlining process requires that the function's definition (including the code) must be available to the compiler. In particular, inline headers that are used from more than one source file must be completely defined within a header file (whereas with regular functions that would be an error).

The most common way to designate that a function is inline is by the use of the inline keyword. One must keep in mind that compilers can be configured to ignore the keyword and use their own optimizations.

Further considerations are given when dealing with inline member function, this will be covered on the Object-Oriented Programming Chapter .


 

To do:
Complete and give examples


Parameters and arguments

edit

The function declaration defines its parameters. A parameter is a variable which takes on the meaning of a corresponding argument passed in a call to a function.

An argument represents the value you supply to a function parameter when you call it. The calling code supplies the arguments when it calls the function.

The part of the function declaration that declares the expected parameters is called the parameter list and the part of function call that specifies the arguments is called the argument list.

//Global functions declaration
int subtraction_function( int parameter1, int parameter2 ) { return ( parameter1 - parameter2 ); }

//Call to the above function using 2 extra variables so the relation becomes more evident
int argument1 = 4;
int argument2 = 3; 
int result = subtraction_function( argument1, argument2 );
// will have the same result as
int result = subtraction_function( 4, 3 );

Many programmers use parameter and argument interchangeably, depending on context to distinguish the meaning. In practice, distinguishing between the two terms is usually unnecessary in order to use them correctly or communicate their use to other programmers. Alternatively, the equivalent terms formal parameter and actual parameter may be used instead of parameter and argument.

Parameters

edit

You can define a function with no parameters, one parameter, or more than one, but to use a call to that function with arguments you must take into consideration what is defined.

Empty parameter list

edit
//Global functions with no parameters
void function() { /*...*/ }
//empty parameter declaration equivalent the use of void
void function( void ) { /*...*/ }

Note:
This is the only valid case were void can be used as a parameter type, you can only derived types from void (ie: void* ).

Multiple parameters

edit

The syntax for declaring and invoking functions with multiple parameters can be a source of errors. When you write the function definition, you must declare the type of each and every parameter.

// Example - function using two int parameters by value
void printTime (int hour, int minute) { 
  std::cout << hour; 
  std::cout << ":"; 
  std::cout << minute; 
}

It might be tempting to write (int hour, minute), but that format is only legal for variable declarations, not for parameter declarations.

However, you do not have to declare the types of arguments when you call a function. (Indeed, it is an error to attempt to do so).

Example

int main ( void ) {
    int hour = 11; 
    int minute = 59; 
    printTime( int hour, int minute ); // WRONG!
    printTime( hour, minute ); // Right!
}

In this case, the compiler can tell the type of hour and minute by looking at their declarations. It is unnecessary and illegal to include the type when you pass them as arguments..

by pointer

edit

A function may use pass by pointer when the object pointed to might not exist, that is, when you are giving either the address of a real object or NULL. Passing a pointer is not different to passing anything else. Its a parameter the same as any other. The characteristics of the pointer type is what makes it a worth distinguishing.

The passing of a pointer to a function is very similar to passing it as a reference. It is used to avoid the overhead of copying, and the slicing problem (since child classes have a bigger memory footprint that the parent) that can occur when passing base class objects by value. This is also the preferred method in C (for historical reasons), were passing by pointer signifies that wanted to modify the original variable. In C++ it is preferred to use references to pointers and guarantee that the function before dereferencing it, verifies the pointer for validity.


 

To do:
Reorder, simplify and clarify


#include <iostream>

void MyFunc( int *x ) 
{ 
  std::cout << *x << std::endl; // See next section for explanation
} 
  
int main() 
{ 
  int i; 
  MyFunc( &i );

  return 0; 
}

Since a reference is just an alias, it has exactly the same address as what it refers to, as in the following example:

#include <iostream>

void ComparePointers (int * a, int * b)
{
  if (a == b)
    std::cout<<"Pointers are the same!"<<std::endl;
  else
    std::cout<<"Pointers are different!"<<std::endl;
}

int main()
{
  int i, j;
  int& r = i;

  ComparePointers(&i, &i);
  ComparePointers(&i, &j);
  ComparePointers(&i, &r);
  ComparePointers(&j, &r);

  return 0;
}

This program will tell you that the pointers are the same, then that they are different, then the same, then different again.

Arrays are similar to pointers, remember?

Now might be a good time to reread the section on arrays. If you do not feel like flipping back that far, though, here's a brief recap: Arrays are blocks of memory space.

int my_array[5];

In the statement above, my_array is an area in memory big enough to hold five ints. To use an element of the array, it must be dereferenced. The third element in the array (remember they're zero-indexed) is my_array[2]. When you write my_array[2], you're actually saying "give me the third integer in the array my_array". Therefore, my_array is an array, but my_array[2] is an int.

Passing a single array element

So let's say you want to pass one of the integers in your array into a function. How do you do it? Simply pass in the dereferenced element, and you'll be fine.

Example

#include <iostream>

void printInt(int printable){
  std::cout << "The int you passed in has value " << printable << std::endl;
}
int main(){
  int my_array[5];
  
  // Reminder: always initialize your array values!
  for(int i = 0; i < 5; i++)
    my_array[i] = i * 2;
  
  for(int i = 0; i < 5; i++)
    printInt(my_array[i]); // <-- We pass in a dereferenced array element
}

This program outputs the following:

The int you passed in has value 0
The int you passed in has value 2
The int you passed in has value 4
The int you passed in has value 6
The int you passed in has value 8

This passes array elements just like normal integers, because array elements like my_array[2] are integers.

Passing a whole array

Well, we can pass single array elements into a function. But what if we want to pass a whole array? We can not do that directly, but you can treat the array as a pointer.

Example

#include <iostream>

void printIntArr(int *array_arg, int array_len){
  std::cout << "The length of the array is " << array_len << std::endl;
  for(int i = 0; i < array_len; i++)
    std::cout << "Array[" << i << "] = " << array_arg[i] << std::endl;
}
 
int main(){
  int my_array[5];

  // Reminder: always initialize your array values!
  for(int i = 0; i < 5; i++)
    my_array[i] = i * 2;
  
  printIntArr(my_array, 5);
}

Note:
Due to array-pointer interchangeability in the context of parameter declarations only, we can also declare pointers as arrays in function parameter lists. It is treated identically. For example, the first line of the function above can also be written as

void printIntArr(int array_arg[], int array_len)

It is important to note that even if it is written as int array_arg[], the parameter is still a pointer of type int *. It is not an array; an array passed to the function will still be automatically converted to a pointer to its first element.

This will output the following:

The length of the array is 5
Array[0] = 0
Array[1] = 2
Array[2] = 4
Array[3] = 6
Array[4] = 8

As you can see, the array in main is accessed by a pointer. Now here's some important points to realize:

  • Once you pass an array to a function, it is converted to a pointer so that function has no idea how to guess the length of the array. Unless you always use arrays that are the same size, you should always pass in the array length along with the array.
  • You've passed in a POINTER. my_array is an array, not a pointer. If you change array_arg within the function, my_array does not change (i.e., if you set array_arg to point to a new array). But if you change any element of array_arg, you're changing the memory space pointed to by array_arg, which is the array my_array.


 

To do:
Passing a single element (by value vs. by reference), passing the whole array (always by reference), passing as const


by reference

edit

The same concept of references is used when passing variables.

Example

void foo( int &i )
{
  ++i;
}
 
int main()
{
  int bar = 5;   // bar == 5
  foo( bar );    // increments bar
  std::cout << bar << std::endl   // 6
 
  return 0;
}

Example2: to swap the values of two integers, we could write:

void swap (int& x, int& y) 
{ 
  int temp = x; 
  x = y; 
  y = temp; 
}

In the call of this function we give two variables of type int:

int i = 7; 
int j = 9; 
swap (i, j); 
cout << i << ' ' << j << endl;

The output of this is '9 7'. Draw a stack diagram for this program to convince yourself this is true. If the parameters x and y were declared as regular parameters (without the '&'s), swap would not work. It would modify x and y in function swap only and have no effect on i and j.

When people start passing things like integers by reference, they often try to use an expression as a reference argument. For example:

int i = 7; 
int j = 9; 
swap (i, j+1); // WRONG!!

This is not legal because the expression j+1 is not a variable — it does not occupy a location that the reference can refer to. It is a little tricky to figure out exactly what kinds of expressions can be passed by reference. For now, a good rule of thumb is that reference arguments have to be variables.

Here we display one of the two common uses of references in function arguments—they allow us to use the conventional syntax of passing an argument by value but manipulate the value in the caller.

Note:
If the parameter is a non-const reference, the caller expects it to be modified. If the function does not want to modify the parameter, a const reference should be used instead.

However there is a more common use of references in function arguments—they can also be used to pass a handle to a large data structure without making multiple copies of it in the process. Consider the following:

void foo( const std::string & s ) // const reference, explained below
{
  std::cout << s << std::endl;
}

void bar( std::string s )
{
  std::cout << s << std::endl;
}

int main()
{
  std::string const text = "This is a test.";

  foo( text ); // doesn't make a copy of "text"
  bar( text ); // makes a copy of "text"

  return 0;
}

In this simple example we're able to see the differences in pass by value and pass by reference. In this case pass by value just expends a few additional bytes, but imagine for instance if text contained the text of an entire book.

The reason why we use a constant reference instead of a reference is the user of this function can assure that the value of the variable passed does not change within the function. We technically call this "const-to-reference".

The ability to pass it by reference keeps us from needing to make a copy of the string and avoids the ugliness of using a pointer.

Note:
It should also be noted that "const-to-reference" only makes sense for complex types -- classes and structs. In the case of ordinal types -- i.e. int, float, bool, etc. -- there is no savings in using a reference instead of simply using pass by value, and indeed the extra costs associated with indirection may make code using a reference slower than code that copies small objects.

Passing an array of fixed-length by using reference

In some cases, a function requires an array of a specific length to work:

void func(int(&parameter)[4]);

Unlike an array changing into a pointer, the parameter is not a PLAIN array that can be changed into a pointer, but rather a reference to array with 4 ints. Therefore, only an array of 4 ints, not array of any other length, not pointer to int, can be passed into this function. This helps you prevent buffer overflow errors because the array object is ALWAYS allocated unless you circumvent the type system by casting.

It can be used to pass an array without specifying the number of elements manually:

template<int n>void func(int(&para)[n]);

The compiler generates the value of length at compile time, inside the function, n stores the number of elements. However, the use of template generates code bloat.

In C++, a multi-dimensional array cannot be converted to a multi-level pointer, therefore, the code below is invalid:

// WRONG
void foo(int**matrix,int n,int m);
int main(){
	int data[10][5];
	// do something on data
	foo(data,10,5);
}

Although an int[10][5] can be converted to an (*int)[5], it cannot be converted to int**. Therefore you may need to hard-code the array bound in the function declaration:

// BAD
void foo(int(*matrix)[5],int n,int m);
int main(){
	int data[10][5];
	// do something on data
	foo(data,10,5);
}

To make the function more generic, template and function overloading should be used:

// GOOD
template<int junk,int rubbish>void foo(int(&matrix)[junk][rubbish],int n,int m);
void foo(int**matrix,int n,int m);
int main(){
	int data[10][5];
	// do something on data
	foo(data,10,5);
}

The reason for having n and m in the first version is mainly for consistency, and also deal with the case that the array allocated is not used completely. It may also be used for checking buffer overflows by comparing n/m with junk/rubbish.

by value

edit

When we want to write a function which the value of the argument is independent to the passed variable, we use pass-by-value approach.

int add(int num1, int num2)
{
 num1 += num2; // change of value of "num1"
 return num1;
}

int main()
{
 int a = 10, b = 20, ans;
 ans = add(a, b);
 std::cout << a << " + " << b << " = " << ans << std::endl;
}

Output:

10 + 20 = 30

The above example shows a property of pass-by-value, the arguments are copies of the passed variable and only in the scope of the corresponding function. This means that we have to afford the cost of copying. However, this cost is usually considered only for larger and more complex variables.
In this case, the values of "a" and "b" are copied to "num1" and "num2" on the function "add()". We can see that the value of "num1" is changed in line 3. However, we can also observe that the value of "a" is kept after passed to this function.

Constant Parameters

edit

The keyword const can also be used as a guarantee that a function will not modify a value that is passed in. This is really only useful for references and pointers (and not things passed by value), though there's nothing syntactically to prevent the use of const for arguments passed by value.

Take for example the following functions:

void foo(const std::string &s)
{
 s.append("blah"); // ERROR -- we can't modify the string
 
 std::cout << s.length() << std::endl; // fine
}
 
void bar(const Widget *w)
{
 w->rotate(); // ERROR - rotate wouldn't be const
 
 std::cout << w->name() << std::endl; // fine
}

In the first example we tried to call a non-const method -- append() -- on an argument passed as a const reference, thus breaking our agreement with the caller not to modify it and the compiler will give us an error.

The same is true with rotate(), but with a const pointer in the second example.

Default values

edit

Parameters in C++ functions (including member functions and constructors) can be declared with default values, like this

int foo (int a, int b = 5, int c = 3);

Then if the function is called with fewer arguments (but enough to specify the arguments without default values), the compiler will assume the default values for the missing arguments at the end. For example, if I call

foo(6, 1)

that will be equivalent to calling

foo(6, 1, 3)

In many situations, this saves you from having to define two separate functions that take different numbers of parameters, which are almost identical except for a default value.

The "value" that is given as the default value is often a constant, but may be any valid expression, including a function call that performs arbitrary computation.

Default values can only be given for the last arguments; i.e. you cannot give a default value for a parameter that is followed by a parameter that does not have a default value, since it will never be used.

Once you define the default value for a parameter in a function declaration, you cannot re-define a default value for the same parameter in a later declaration, even if it is the same value.

Ellipsis (...) as a parameter

edit

If the parameter list ends with an ellipsis, it means that the arguments number must be equal or greater than the number of parameters specified. It will in fact create a variadic function, a function of variable arity; that is, one which can take different numbers of arguments.


 

To do:
Mention printf, <cstdarg> and check declaration


Note:

The variadic function feature is going to be readdressed in the upcoming C++ language standard, C++0x; with the possible inclusion of variatic macros and the ability to create variadic template classes and variadic template functions. Variadic templates will finally allow the creation of true tuple classes in C++.

Returning values

edit

When declaring a function, you must declare it in terms of the type that it will return, this is done in three steps, in the function declaration, the function implementation (if distinct) and on the body of the same function with the return keyword.

Functions with results

You might have noticed by now that some of the functions yield results. Other functions perform an action but don't return a value.

Other ways to get a value from a function is to use a pointer or a reference as argument or use a global variable

Get more that a single value from a function

The return type determines the capacity, any type will work from an array or a std::vector, a struct or a class, it is only restricted by the return type you chose.

That raises some questions
  • What happens if you call a function and you don't do anything with the result (i.e. you don't assign it to a variable or use it as part of a larger expression)?
  • What happens if you use a function without a result as part of an expression, like newLine() + 7?
  • Can we write functions that yield results, or are we stuck with things like newLine and printTwice?

The answer to the third question is "yes, you can write functions that returns values,". For now I will leave it up to you to answer the other two questions by trying them out. Any time you have a question about what is legal or illegal in C++, a first step to find out is to ask the compiler. However you should be aware of two issues, that we already mentioned when introducing the compiler: First a compiler may have bugs just like any other software, so it happens that not every source code which is forbidden in C++ is properly rejected by the compiler, and vice versa. The other issue is even more dangerous: You can write programs in C++ which a C++ implementation is not required to reject, but whose behavior is not defined by the language. Needless to say, running such a program can, and occasionally will, do harmful things to the system it is running or produce corrupt output!

For example:

int MyFunc(); // returns an int
SOMETYPE MyFunc(); // returns a SOMETYPE

int* MyFunc(); // returns a pointer to an int
SOMETYPE *MyFunc(); // returns a pointer to a SOMETYPE
SOMETYPE &MyFunc(); // returns a reference to a SOMETYPE

If you have understood the syntax of pointer declarations, the declaration of a function that returns a pointer or a reference should seem logical. The above piece of code shows how to declare a function that will return a reference or a pointer; below are outlines of what the definitions (implementations) of such functions would look like:

SOMETYPE *MyFunc(int *p) 
{ 
  //... 

  return p; 
} 

SOMETYPE &MyFunc(int &r) 
{ 
  //... 

  return r; 
}

The return statement causes execution to jump from the current function to whatever function called the current function. An optional a result (return variable) can be returned. A function may have more than one return statement (but returning the same type).

Syntax
return;
return value;

Within the body of the function, the return statement should NOT return a pointer or a reference that has the address in memory of a local variable that was declared within the function, because as soon as the function exits, all local variables are destroyed and your pointer or reference will be pointing to some place in memory which you no longer own, so you cannot guarantee its contents. If the object to which a pointer refers is destroyed, the pointer is said to be a dangling pointer until it is given a new value; any use of the value of such a pointer is invalid. Having a dangling pointer like that is dangerous; pointers or references to local variables must not be allowed to escape the function in which those local (aka automatic) variables live.

However, within the body of your function, if your pointer or reference has the address in memory of a data type, struct, or class that you dynamically allocated the memory for, using the new operator, then returning said pointer or reference would be reasonable:

SOMETYPE *MyFunc()  //returning a pointer that has a dynamically 
{           //allocated memory address is valid code 
  int *p = new int[5]; 

  //... 

  return p; 
}

In most cases, a better approach in that case would be to return an object such as a smart pointer which could manage the memory; explicit memory management using widely distributed calls to new and delete (or malloc and free) is tedious, verbose and error prone. At the very least, functions which return dynamically allocated resources should be carefully documented. See this book's section on memory management for more details.

const SOMETYPE *MyFunc(int *p) 
{
  //... 

  return p; 
}

In this case the SOMETYPE object pointed to by the returned pointer may not be modified, and if SOMETYPE is a class then only const member functions may be called on the SOMETYPE object.

If such a const return value is a pointer or a reference to a class then we cannot call non-const methods on that pointer or reference since that would break our agreement not to change it.

Note:
As a general rule methods should be const except when it's not possible to make them such. While getting used to the semantics you can use the compiler to inform you when a method may not be const -- it will (usually) give an error if you declare a method const that needs to be non-const.

Static returns

edit

When a function returns a variable (or a pointer to one) that is statically located, one must keep in mind that it will be possible to overwrite its content each time a function that uses it is called. If you want to save the return value of this function, you should manually save it elsewhere. Most such static returns use global variables.

Of course, when you save it elsewhere, you should make sure to actually copy the value(s) of this variable to another location. If the return value is a struct, you should make a new struct, then copy over the members of the struct.

One example of such a function is the Standard C Library function localtime.

Return "codes" (best practices)

edit

There are 2 kinds of behaviors :

Note:
The selection of, and consistent use of this practice helps to avoid simple errors. Personal taste or organizational dictates may influence the decision, but a general rule-of-thumb is that you should follow whatever choice has been made in the code base you are currently working in. However, there may be valid reasons for making a different choice in any particular situation.

Positive means success
edit

This is the "logical" way to think, and as such the one used by almost all beginners. In C++, this takes the form of a boolean true/false test, where "true" (also 1 or any non-zero number) means success, and "false" (also 0) means failure.

The major problem of this construct is that all errors return the same value (false), so you must have some kind of externally visible error code in order to determine where the error occurred. For example:

 bool bOK;
 if (my_function1())
 {
     // block of instruction 1
     if (my_function2())
     {
         // block of instruction 2
         if (my_function3())
         {
              // block of instruction 3
              // Everything worked
              error_code = NO_ERROR;
              bOK = true;
         }
         else
         {
              //error handler for function 3 errors
              error_code = FUNCTION_3_FAILED;
              bOK = false;
         }
     }
     else
     {
         //error handler for function 2 errors
         error_code = FUNCTION_2_FAILED;
         bOK = false;
     }
 }
 else
 {
     //error handler for function 1 errors
     error_code = FUNCTION_1_FAILED;
     bOK = false;
 }
 return bOK;

As you can see, the else blocks (usually error handling) of my_function1 can be really far from the test itself; this is the first problem. When your function begins to grow, it's often difficult to see the test and the error handling at the same time.

This problem can be compensated by source code editor features such as folding, or by testing for a function returning "false" instead of true.

 if (!my_function1()) // or if (my_function1() == false) 
 {
     //error handler for function 1 errors

     //...

This can also make the code look more like the "0 means success" paradigm, but a little less readable.

The second problem of this construct is that it tends to break up logical tests (my_function2 is one level more indented, my_function3 is 2 levels indented) which causes legibility problems.

One advantage here is that you follow the structured programming principle of a function having a single entry and a single exit.

The Microsoft Foundation Class Library (MFC) is an example of a standard library that uses this paradigm.

0 means success
edit

This means that if a function returns 0, the function has completed successfully. Any other value means that an error occurred, and the value returned may be an indication of what error occurred.

The advantage of this paradigm is that the error handling is closer to the test itself. For example the previous code becomes:

 if (0 != my_function1())
 {
     //error handler for function 1 errors
     return FUNCTION_1_FAILED;
 }
 // block of instruction 1
 if (0 != my_function2())
 {
     //error handler for function 2 errors
     return FUNCTION_2_FAILED;
 }
 // block of instruction 2
 if (0 != my_function3())
 {
     //error handler for function 3 errors
     return FUNCTION_3_FAILED;
 }
 // block of instruction 3
 // Everything worked
 return 0; // NO_ERROR

In this example, this code is more readable (this will not always be the case). However, this function now has multiple exit points, violating a principle of structured programming.

The C Standard Library (libc) is an example of a standard library that uses this paradigm.

Note:
Some people argue that using functions results in a performance penalty. In this case just use inline functions and let the compiler do the work. Small functions mean visibility, easy debugging and easy maintenance.


Composition

edit

Just as with mathematical functions, C++ functions can be composed, meaning that you use one expression as part of another. For example, you can use any expression as an argument to a function:

double x = cos (angle + pi/2);

This statement takes the value of pi, divides it by two and adds the result to the value of angle. The sum is then passed as an argument to the cos function.

You can also take the result of one function and pass it as an argument to another:

double x = exp (log (10.0));

This statement finds the log base e of 10 and then raises e to that power. The result gets assigned to x; I hope you know what it is.

Recursion

edit

In programming languages, recursion was first implemented in Lisp on the basis of a mathematical concept that existed earlier on, it is a concept that allows us to break down a problem into one or more subproblems that are similar in form to the original problem, in this case, of having a function call itself in some circumstances. It is generally distinguished from iterators or loops.

A simple example of a recursive function is:

  void func(){
     func();
  }

It should be noted that non-terminating recursive functions as shown above are almost never used in programs (indeed, some definitions of recursion would exclude such non-terminating definitions). A terminating condition is used to prevent infinite recursion.

Example
  double power(double x, int n)
  {
   if (n < 0)
   {
      std::cout << std::endl
                << "Negative index, program terminated.";
      exit(1);
   }
   if (n)
      return x * power(x, n-1);
   else
      return 1.0;
  }

The above function can be called like this:

  x = power(x, static_cast<int>(power(2.0, 2)));

Why is recursion useful? Although, theoretically, anything possible by recursion is also possible by iteration (that is, while), it is sometimes much more convenient to use recursion. Recursive code happens to be much easier to follow as in the example below. The problem with recursive code is that it takes too much memory. Since the function is called many times, without the data from the calling function removed, memory requirements increase significantly. But often the simplicity and elegance of recursive code overrules the memory requirements.

The classic example of recursion is the factorial:  , where   by convention. In recursion, this function can be succinctly defined as

 unsigned factorial(unsigned n)
 {
   if (n != 0) 
   {
     return n * factorial(n-1);
   } 
   else 
   {
     return 1;
   }
 }

With iteration, the logic is harder to see:

 unsigned factorial2(unsigned n)
 {
   int a = 1;
   while(n > 0)
   {
     a = a*n;
     n = n-1;
   }
   return a;
 }

Although recursion tends to be slightly slower than iteration, it should be used where using iteration would yield long, difficult-to-understand code. Also, keep in mind that recursive functions take up additional memory (on the stack) for each level. Thus they can run out of memory where an iterative approach may just use constant memory.

Each recursive function needs to have a Base Case. A base case is where the recursive function stops calling itself and returns a value. The value returned is (hopefully) the desired value.

For the previous example,

 unsigned factorial(unsigned n)
 {
   if(n != 0) 
   {
     return n * factorial(n-1);
   } 
   else
   {
     return 1;
   }
 }

the base case is reached when  . In this example, the base case is everything contained in the else statement (which happens to return the number 1). The overall value that is returned is every value from   to   multiplied together. So, suppose we call the function and pass it the value  . The function then does the math   and returns 6 as the result of calling factorial(3).

Another classic example of recursion is the sequence of Fibonacci numbers:

0 1 1 2 3 5 8 13 21 34 ...

The zeroth element of the sequence is 0. The next element is 1. Any other number of this series is the sum of the two elements coming before it. As an exercise, write a function that returns the nth Fibonacci number using recursion.

main

edit

The function main also happens to be the entry point of any (standard-compliant) C++ program and must be defined. The compiler arranges for the main function to be called when the program begins execution. main may call other functions which may call yet other functions.

Note:
main also special because the user code is not allowed to call it; in particular, it cannot be directly or indirectly recursive. This is one of the many small ways in which C++ differs from C.

The main function returns an integer value. In certain systems, this value is interpreted as a success/failure code. The return value of zero signifies a successful completion of the program. Any non-zero value is considered a failure. Unlike other functions, if control reaches the end of main(), an implicit return 0; for success is automatically added. To make return values from main more readable, the header file cstdlib defines the constants EXIT_SUCCESS and EXIT_FAILURE (to indicate successful/unsuccessful completion respectively).

Note:
The ISO C++ Standard (ISO/IEC 14882:1998) specifically requires main to have a return type of int. But the ISO C Standard (ISO/IEC 9899:1999) actually does not, though most compilers treat this as a minor warning-level error.

The explicit use of return 0; (or return EXIT_SUCCESS;) to exit the main function is left to the coding style used.

The main function can also be declared like this:

int main(int argc, char **argv){
  // code
}

which defines the main function as returning an integer value int and taking two parameters. The first parameter of the main function, argc, is an integer value int that specifies the number of arguments passed to the program, while the second, argv, is an array of strings containing the actual arguments. There is almost always at least one argument passed to a program; the name of the program itself is the first argument, argv[0]. Other arguments may be passed from the system.

Example

#include <iostream>

int main(int argc, char **argv){
  std::cout << "Number of arguments: " << argc << std::endl;
  for(size_t i = 0; i < argc; i++)
    std::cout << "  Argument " << i << " = '" << argv[i] << "'" << std::endl;
}

Note:
size_t is the return type of sizeof function. size_t is a typedef for some unsigned type and is often defined as unsigned int or unsigned long but not always.

If the program above is compiled into the executable arguments and executed from the command line like this in *nix:

$ ./arguments I love chocolate cake

Or in Command Prompt in Windows or MS-DOS:

C:\>arguments I love chocolate cake

It will output the following (but note that argument 0 may not be quite the same as this—it might include a full path, or it might include the program name only, or it might include a relative path, or it might even be empty):

Number of arguments: 5
  Argument 0 = './arguments'
  Argument 1 = 'I'
  Argument 2 = 'love'
  Argument 3 = 'chocolate'
  Argument 4 = 'cake'

You can see that the command line arguments of the program are stored into the argv array, and that argc contains the length of that array. This allows you to change the behavior of a program based on the command line arguments passed to it.

Note:
argv is a (pointer to the first element of an) array of strings. As such, it can be written as char **argv or as char *argv[]. However, char argv[][] is not allowed. Read up on C++ arrays for the exact reasons for this.

Also, argc and argv are the two most common names for the two arguments given to the main function. You can think them to stand for "arguments count" and "arguments variables" respectively. They can, however, be changed if you'd like. The following code is just as legal:

int main(int foo, char **bar){
  // code
}

However, any other programmer that sees your code might get mad at you if you code like that.

From the example above, we can also see that C++ do not really care about what the variables' names are (of course, you cannot use reserved words as names) but their types.

Pointers to functions

edit

The pointers we have looked at so far have all been data pointers, pointers to functions (more often called function pointers) are very similar and share the same characteristics of other pointers but in place of pointing to a variable they point to functions. Creating an extra level of indirection, as a way to use the functional programming paradigm in C++, since it facilitates calling functions which are determined at runtime from the same piece of code. They allow passing a function around as parameter or return value in another function.

Using function pointers has exactly the same overhead as any other function call plus the additional pointer indirection and since the function to call is determined only at runtime, the compiler will typically not inline the function call as it could do anywhere else. Because of this characteristics, using function pointers may add up to be significantly slower than using regular function calls, and be avoided as a way to gain performance.

Note:
Function pointers are mostly used in C, C++ also permits another constructs to enable functional programming that are called functors (class type functors and template type functors) that have some advantages over function pointers.

To declare a pointer to a function naively, the name of the pointer must be parenthesized, otherwise a function returning a pointer will be declared. You also have to declare the function's return type and its parameters. These must be exact!

Consider:

int (*ptof)(int arg);

The function to be referenced must obviously have the same return type and the same parameter type as that of the pointer to function. The address of the function can be assigned just by using its name, optionally prefixed with the address-of operator &. Calling the function can be done by using either ptof(<value>) or (*ptof)(<value>).

So:

int (*ptof)(int arg);
int func(int arg){
    //function body
}
ptof = &func; // get a pointer to func
ptof = func;  // same effect as ptof = &func
(*ptof)(5);   // calls func
ptof(5);      // same thing.

A function returning a float can't be pointed to by a pointer returning a double. If two names are identical (such as int and signed, or a typedef name), then the conversion is allowed. Otherwise, they must be entirely the same. You define the pointer by grouping the * with the variable name as you would any other pointer. The problem is that it might get interpreted as a return type instead.

It is often clearer to use a typedef for function pointer types; this also provides a place to give a meaningful name to the function pointer's type:

typedef int (*int_to_int_function)(int);
int_to_int_function ptof;
int *func (int);   // WRONG: Declares a function taking an int returning pointer-to-int.
int (*func) (int); // RIGHT: Defines a pointer to a function taking an int returning int.

To help reduce confusion, it is popular to typedef either the function type or the pointer type:

typedef int ifunc (int);    // now "ifunc" means "function taking an int returning int"
typedef int (*pfunc) (int); // now "pfunc" means "pointer to function taking an int returning int"

If you typedef the function type, you can declare, but not define, functions with that type. If you typdef the pointer type, you cannot either declare or define functions with that type. Which to use is a matter of style (although the pointer is more popular).

To assign a pointer to a function, you simply assign it to the function name. The & operator is optional (it's not ambiguous). The compiler will automatically select an overloaded version of the function appropriate to the pointer, if one exists:

int f (int, int);
int f (int, double);
int g (int, int = 4);
double h (int);
int i (int);

int (*p) (int) = &g; // ERROR: The default parameter needs to be included in the pointer type.
p = &h;              // ERROR: The return type needs to match exactly.
p = &i;              // Correct.
p = i;               // Also correct.

int (*p2) (int, double);
p2 = f;              // Correct: The compiler automatically picks "int f (int, double)".

Using a pointer to a function is even simpler - you simply call it like you would a function. You are allowed to dereference it using the * operator, but you don't have to:

#include <iostream>

int f (int i) { return 2 * i; }

int main ()
 {
  int (*g) (int) = f;
  std::cout<<"g(4) is "<<g(4)<<std::endl;    // Will output "g(4) is 8"
  std::cout<<"(*g)(5) is "<<g(5)<<std::endl; // Will output "g(5) is 10"
  return 0;
 }

Callback

edit

In computer programming, a callback is executable code that is passed as an argument to other code. It allows a lower-level abstraction layer to call a function defined in a higher-level layer. A callback is often back on the level of the original caller.

 

Usually, the higher-level code starts by calling a function within the lower-level code, passing to it a pointer or handle to another function. While the lower-level function executes, it may call the passed-in function any number of times to perform some subtask. In another scenario, the lower-level function registers the passed-in function as a handler that is to be called asynchronously by the lower-level at a later time in reaction to something.

A callback can be used as a simpler alternative to polymorphism and generic programming, in that the exact behavior of a function can be dynamically determined by passing different (yet compatible) function pointers or handles to the lower-level function. This can be a very powerful technique for code reuse. In another common scenario, the callback is first registered and later called asynchronously.

 
 

To do:
Add missing, redirect links info and add examples...


Overloading

edit

Function overloading is the use of a single name for several different functions in the same scope. Multiple functions that share the same name must be differentiated by using another set of parameters for every such function. The functions can be different in the number of parameters they expect, or their parameters can differ in type. This way, the compiler can figure out the exact function to call by looking at the arguments the caller supplied. This is called overload resolution, and is quite complex.

// Overloading Example

// (1)
double geometric_mean( int, int );
 
// (2)
double geometric_mean( double, double );
 
// (3)
double geometric_mean( double, double, double );
 
// ...
 
// Will call (1):
geometric_mean( 10, 25 );
// Will call (2):
geometric_mean( 22.1, 421.77 );
// Will call (3):
geometric_mean( 11.1, 0.4, 2.224 );

Under some circumstances, a call can be ambiguous, because two or more functions match with the supplied arguments equally well.

Example, supposing the declaration of geometric_mean above:

// This is an error, because (1) could be called and the second
// argument casted to an int, and (2) could be called with the first
// argument casted to a double. None of the two functions is
// unambiguously a better match.
geometric_mean(7, 13.21);
// This will call (3) too, despite its last argument being an int,
// Because (3) is the only function which can be called with 3
// arguments
geometric_mean(1.1, 2.2, 3);

Template and non-template can be overloaded. A non-template function takes precedence over a template, if both forms of the function match the supplied arguments equally well.

Note that you can overload many operators in C++ too.

Overloading resolution

edit

Please beware that overload resolution in C++ is one of the most complicated parts of the language. This is probably unavoidable in any case with automatic template instantiation, user defined implicit conversions, built-in implicit conversation and more as language features. So do not despair if you do not understand this at first go. It is really quite natural, once you have the ideas, but written down it seems extremely complicated.


 

To do:
*This section does not cover the selection of constructors because, well, that's even worse. Namespaces are also not considered below.

  • Feel free to add the missing information, possibly as another chapter.


The easiest way to understand overloading is to imagine that the compiler first finds every function which might possibly be called, using any legal conversions and template instantiations. The compiler then selects the best match, if any, from this set. Specifically, the set is constructed like this:

  • All functions with matching name, including function templates, are put into the set. Return types and visibility are not considered. Templates are added with as closely matching parameters as possible. Member functions are considered functions with the first parameter being a pointer-to-class-type.
  • Conversion functions are added as so-called surrogate functions, with two parameters, the first being the class type and the second the return type.
  • All functions that do not match the number of parameters, even after considering defaulted parameters and ellipses, are removed from the set.
  • For each function, each argument is considered to see if a legal conversion sequence exists to convert the caller's argument to the function's parameters. If no such conversion sequence can be found, the function is removed from the set.

The legal conversions are detailed below, but in short a legal conversion is any number of built-in (like int to float) conversions combined with at most one user defined conversion. The last part is critical to understand if you are writing replacements to built-in types, such as smart pointers. User defined conversions are described above, but to summarize it is

  1. implicit conversion operators like operator short toShort();
  2. One argument constructors (If a constructor has all but one parameter defaulted, it is considered one-argument)

The overloading resolution works by attempting to establish the best matching function.

Easy conversions are preferred

Looking at one parameter, the preferred conversion is roughly based on scope of the conversion. Specifically, the conversions are preferred in this order, with most-preferred highest:

  1. No conversion, adding one or more const, adding reference, convert array to pointer to first member
    1. const are preferred for rvalues (roughly constants) while non-const are preferred for lvalues (roughly assignables)
  2. Conversion from short integral types (bool, char, short) to int, and float to double.
  3. Built-in conversions, such as between int and double and pointer type conversion. Pointer conversion are ranked as
    1. Base to derived (pointers) or derived to base (for pointers-to-members), with most-derived preferred
    2. Conversion to void*
    3. Conversion to bool
  4. User-defined conversions, see above.
  5. Match with ellipses. (As an aside, this is rather useful knowledge for template meta programming)

The best match is now determined according to the following rules:

  • A function is only a better match if all parameters match at least as well

In short, the function must be better in every respect --- if one parameter matches better and another worse, neither function is considered a better match. If no function in the set is a better match than both, the call is ambiguous (i.e., it fails) Example:

void foo(void*, bool);
void foo(int*, int);
 
int main() {
 int a;
 foo(&a, true); // ambiguous 
}
  • Non-template should be preferred over template

If all else is equal between two functions, but one is a template and the other not, the non-template is preferred. This seldom causes surprises.

  • Most-specialized template is preferred

When all else is equal between two template function, but one is more specialized than the other, the most specialized version is preferred. Example:

template<typename T> void foo(T);  //1
template<typename T> void foo(T*); //2
 
int main() {
 int a;
 foo(&a); // Calls 2, since 2 is more specialized.
}

Which template is more specialized is an entire chapter unto itself.

  • Return types are ignored

This rule is mentioned above, but it bears repeating: Return types are never part of overload resolutions, even if the function selected has a return type that will cause the compilation to fail. Example:

void foo(int);
int foo(float);
 
int main() { 
 // This will fail since foo(int) is best match, and void cannot be converted to int.
 return foo(5); 
}
  • The selected function may not be accessible

If the selected best function is not accessible (e.g., it is a private function and the call it not from a member or friend of its class), the call fails.