C++ Programming
Linker
The linker is a program that combines object files into an executable file. The linker resolves linkage issues, such as the use of symbols or identifiers which are defined in one translation unit and are needed from other translation units. Symbols or identifiers which are needed outside a single translation unit have external linkage. In short, the linker's job is to resolve references to undefined symbols by finding out which other object file defines the symbol in question, and replacing placeholders with the symbol's address. The full process is more complicated than this, but the basic ideas apply.
Linkers can take objects from a collection called a library. Depending on the library (system, language, or external) and the options passed, the linker may include only those symbols that are referenced from other object files or libraries. Libraries exist for diverse purposes, and one or more system libraries are usually linked in by default. We will take a closer look at libraries in the Libraries Section of this book.
Linking
The process of connecting or combining object files produced by a compiler with the libraries necessary to make a working executable program (or a library) is called linking. Linkage refers to the way in which a program is built out of a number of translation units.
C++ programs can be compiled and linked with programs written in other languages, such as C, Fortran, assembly language, and Pascal.
- The appropriate compiler compiles each module separately. A C++ compiler compiles each ".cpp" file into a ".o" file, an assembler assembles each ".asm" file into a ".o" file, a Pascal compiler compiles each ".pas" file into a ".o" file, etc.
- The linker links all the ".o" files together in a separate step, creating the final executable file.
Linkage
Every function has either external or internal linkage.
A function with internal linkage is only visible inside one translation unit. When the compiler compiles a function with internal linkage, the compiler writes the machine code for that function at some address and puts that address in all calls to that function (which are all in that one translation unit), and does not export the function's name from the ".o" file. If there is a call to a function that has been declared with internal linkage but is not defined in this translation unit, the compiler can immediately report the problem to the programmer. If a function with internal linkage never gets called, the compiler can do "dead code elimination" and leave it out of the ".o" file.
The linker never hears about those functions with internal linkage, so it knows nothing about them.
A function declared with external linkage is visible inside several translation units. When a compiler compiles a call to that function in one translation unit, it does not have any idea where that function is, so it leaves a placeholder in all calls to that function, and instructions in the ".o" file to replace that placeholder with the address of a function with that name. If that function is never defined, the compiler cannot know that, so the programmer does not see an error until the linker runs.
When a compiler compiles (the definition of) a function with external linkage (in some other translation unit), the compiler writes the machine code of that function at some address, and puts that address and the name of the function in the ".o" file where the linker can find it. The compiler assumes that the function will be called from some other translation unit (some other ".o" file), and must leave that function in this ".o" file, even if it ends up that the function is never called from any translation unit.
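The split between declaring and defining a function with external linkage can be sketched in a single file. This is only an illustration: the names square and sum_of_squares are hypothetical, and in a real project the declaration would normally sit in a header and the definition in a separate ".cpp" file, so that the linker rather than the compiler connects the call to the definition.

```cpp
// Declaration only: gives the compiler the name and signature.
// In a multi-file project this line would usually come from a header.
int square(int x);

// A caller. To compile this function the compiler needs only the
// declaration above; it does not need to see the body of square.
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}

// Definition with external linkage (the default for free functions).
// In a multi-file project this would live in exactly one other ".cpp" file.
int square(int x) {
    return x * x;
}
```

If the definition of square were removed, each file would still compile, and the missing symbol would only be reported at link time.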
Most code conventions specify that header files contain only declarations, not definitions. Most code conventions specify that implementation files (".cpp" files) contain only definitions and local declarations, not external declarations.
Under these conventions, the "extern" keyword is used only in header files, never in implementation files, and internal linkage is indicated only in implementation files, never in header files. Consequently the "static" keyword is used only in implementation files, never in header files, except when "static" is used inside a class definition inside a header file, where it indicates something other than internal linkage.
We discuss header files and implementation files in more detail later in the File Organization Section of the book.
Internal
The static keyword can be used in four different ways:
- to create permanent storage for local variables in a function.
- to specify internal linkage.
- to declare member functions that act like non-member functions.
- to create a single copy of a data member.
- Internal linkage
When used on a free function, a global variable, or a global constant, it specifies internal linkage (as opposed to extern, which specifies external linkage). Internal linkage limits access to the data or function to the current translation unit.
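The four uses of static listed above can be sketched together in one file. All names here (next_id, scale, scaled, Widget) are hypothetical and chosen only for illustration.

```cpp
// 1. Permanent storage for a local variable: counter is initialized once
//    and keeps its value between calls.
int next_id() {
    static int counter = 0;
    return ++counter;
}

// 2. Internal linkage: scale is visible only in this translation unit.
static int scale = 3;

int scaled(int v) {
    return v * scale;
}

class Widget {
public:
    Widget() { ++live_count; }
    ~Widget() { --live_count; }

    // 3. Static member function: called without an object,
    //    much like a non-member function.
    static int count() { return live_count; }

private:
    // 4. Static data member: a single copy shared by every Widget.
    static int live_count;
};

// The single definition of the static data member.
int Widget::live_count = 0;
```

Only the second use affects linkage; the other three change storage duration or class membership.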
Examples of use outside of any function or class:
static int apples = 15;
- defines a "static global" variable named apples, with initial value 15, only visible from this translation unit.
static int bananas;
- defines a "static global" variable named bananas, with initial value 0, only visible from this translation unit.
int g_fruit;
- defines a global variable named g_fruit, with initial value 0, visible from every translation unit. Such variables are often frowned on as poor style.
static const int muffins_per_pan=12;
- defines a constant named muffins_per_pan, visible only in this translation unit. The static keyword is redundant here.
const int hours_per_day=24;
- defines a constant named hours_per_day, only visible in this translation unit. (This acts the same as static const int hours_per_day=24;).
static void f();
- declares that there is a function f taking no arguments and with no return value defined in this translation unit. Such a forward declaration is often used when defining mutually recursive functions.
static void f(){;}
- defines the function f() declared above. This function can only be called from other functions and members in this translation unit; it is invisible to other translation units.
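As a small sketch of the forward-declaration case mentioned above, the following pair of mutually recursive helpers both have internal linkage, and only one wrapper function is exported. The names is_even, is_odd, and parity_is_even are hypothetical.

```cpp
// Forward declaration with internal linkage. It is needed because is_even
// calls is_odd before is_odd has been defined.
static bool is_odd(unsigned n);

// Both helpers are invisible to other translation units.
static bool is_even(unsigned n) {
    return n == 0 ? true : is_odd(n - 1);
}

static bool is_odd(unsigned n) {
    return n == 0 ? false : is_even(n - 1);
}

// The only function in this file with external linkage; other translation
// units can call it but cannot name the two helpers above.
bool parity_is_even(unsigned n) {
    return is_even(n);
}
```

Another translation unit could define its own unrelated function named is_even without causing a clash at link time.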
External
All entities in the C++ Standard Library have external linkage.
The extern keyword tells the compiler that a variable is declared here but defined elsewhere, possibly in another translation unit. The linker then finds the actual definition and makes every use of the extern variable refer to that location. A variable declared with extern and no initializer has no space allocated for it by that declaration, as it should be properly defined elsewhere. If a variable is declared extern and used, and the linker finds no definition of it, the linker reports an error such as "unresolved external symbol" or "undefined reference" (the exact wording depends on the toolchain).
Examples:
extern int i;
- declares that there is a variable named i of type int, defined somewhere in the program.
extern int j = 0;
- defines a variable j with external linkage; the extern keyword is redundant here.
extern void f();
- declares that there is a function f taking no arguments and with no return value defined somewhere in the program; extern is redundant, but sometimes considered good style.
extern void f() {;}
- defines the function f() declared above; again, the extern keyword is technically redundant here as external linkage is default.
extern const int k = 1;
- defines a constant int k with value 1 and external linkage; extern is required because const variables have internal linkage by default.
extern declarations are frequently used to share data across multiple source files.
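A minimal sketch of sharing a variable this way is shown below, collapsed into one file. The names request_count and record_request are hypothetical; in a real project the extern declaration would usually sit in a header included by several ".cpp" files, and the definition would appear in exactly one of them.

```cpp
// Declaration: announces the variable's name and type.
// No storage is allocated by this line.
extern int request_count;

// Code that uses the variable needs only the declaration above.
void record_request() {
    ++request_count;
}

// Definition: allocates the storage and gives the initial value.
// Exactly one translation unit in the program should contain it.
int request_count = 0;
```

If the definition were missing from every translation unit, each file would still compile and the error would surface only at link time.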
When applied to function declarations, extern can be followed by the string literal "C" or "C++". This linkage specification selects the naming convention of the given language, so "C" turns off C++ name mangling for the declared function. For example, extern "C" int plain_c_func(int param); allows C++ code to call a C library function plain_c_func.
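The following sketch shows a linkage specification in use. To stay self-contained it defines the C-linkage function in the same C++ file, whereas the function would more typically be compiled from a C source file. The names c_style_add and add_three are hypothetical.

```cpp
// Declaration with C language linkage. On typical implementations the
// symbol is emitted under the plain name c_style_add, so a C translation
// unit could declare it as
//     int c_style_add(int a, int b);
// and call it.
extern "C" int c_style_add(int a, int b);

// Definition, also with C language linkage. Because of that linkage,
// the function cannot be overloaded with another extern "C" function
// of the same name.
extern "C" int c_style_add(int a, int b) {
    return a + b;
}

// An ordinary C++ function calling the C-linkage function.
int add_three(int a, int b, int c) {
    return c_style_add(c_style_add(a, b), c);
}
```

The call site looks the same as for any other function; only the symbol name seen by the linker changes.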