The Compiler


A compiler is a program that translates a computer program written in one computer language (the source code) into an equivalent program written in the computer's native machine language. This process of translation, which includes several distinct steps, is called compilation. Since the compiler is itself a program written in a computer language, the situation may seem a paradox akin to the chicken-and-egg dilemma. The paradox is resolved by noting that the first compiler for a language cannot be written in that language; it is written in a previously available language, or even in machine code.



The compilation output of a compiler is the result of translating (compiling) a program. The most important part of the output is saved to a file called an object file, which holds the translated form of the corresponding source file.

Some additional files may be created or needed for a successful compilation. That data is not part of the C++ language; it may result from the compilation of external code (a library, for example). It can depend on the specific compiler you use (MS Visual Studio, for example, adds several extra files to a project), in which case you should check the documentation. The data can also be part of a specific framework that needs to be accessed. Beware that some of these constructs may limit the portability of the code.

The instructions of this compiled program can then be run (executed) by the computer if the object file is in an executable format. However, building a complete program requires steps in addition to compilation proper: preprocessing, which happens before it, and linking, which happens after it.



Compile time refers to the time at which, and the operations that, a compiler performs (i.e., compile-time operations) during a build (creation) of a program (executable or not). Most of the uses of "static" in the C++ language are directly related to compile-time information.

The operations performed at compile time usually include lexical analysis, syntax analysis, various kinds of semantic analysis (e.g., type checks, some of the type casts, and template instantiation) and code generation.

The definition of a programming language will specify compile time requirements that source code must meet to be successfully compiled.

Compile time occurs before link time (when the output of one or more compiled files is joined together) and runtime (when a program is executed). In some programming languages it may be necessary for some compilation and linking to occur at runtime.


Run-time, or execution time, starts at the moment the program starts to execute and ends when it exits. At this stage the compiler is irrelevant and has no control. This is the most important stage with regard to optimizations (a program is compiled once but run many times) and debugging (tracing and interaction are only possible at this stage). It is also at run-time that some of the type casting may occur and that Run-Time Type Information (RTTI) has relevance. The concept of runtime will be mentioned again when relevant.

Lexical analysis

This is alternatively known as scanning or tokenisation. It happens before syntax analysis and converts the code into tokens, which are the parts of the code that the program will actually use. The source code, expressed as characters (arranged on lines), is converted into a sequence of special tokens: one for each reserved keyword, and tokens for data types, identifiers and values. The lexical analyzer is the part of the compiler that removes whitespace and other non-compilable characters from the source code. It uses whitespace to separate different tokens and then discards it.

To give a simple illustration of the process:

int main()
{
    std::cout << "hello world" << std::endl;
    return 0;
}

Depending on the lexical rules used it might be tokenized as:

1 = string "int"
2 = string "main"
3 = opening parenthesis
4 = closing parenthesis
5 = opening brace
6 = string "std"
7 = namespace operator
8 = string "cout"
9 = << operator
10 = string ""hello world""
11 = string "endl"
12 = semicolon
13 = string "return"
14 = number 0
15 = closing brace

And so for this program the lexical analyzer might send something like:

1 2 3 4 5 6 7 8 9 10 9 6 7 11 12 13 14 12 15

This sequence goes to the syntactical analyzer, discussed next, to be parsed. It is easier for the syntactical analyzer to apply the rules of the language when it can work with numerical values, can distinguish between language syntax (such as the semicolon) and everything else, and knows what kind of token each item is.

Syntax analysis

This step (sometimes also called syntax checking) ensures that the code is valid and can be turned into an executable program. The syntactical analyzer applies rules to the code, checking that each opening brace has a corresponding closing brace, that each declaration has a type, that the type exists, and so on. Syntax analysis is more complicated than lexical analysis.

As an example:

int main()
{
    std::cout << "hello world" << std::endl;
    return 0;
}

  • The syntax analyzer would first look at the string "int", check it against defined keywords, and find that it is a type for integers.
  • The analyzer would then look at the next token as an identifier, and check to make sure that it has used a valid identifier name.
  • It would then look at the next token. Because it is an opening parenthesis, it will treat "main" as a function. Had it found a semicolon instead, it would have treated "main" as the declaration of a variable, and had it found an equals sign, as the initialization of an integer variable.
  • After the opening parenthesis it would find a closing parenthesis, meaning that the function has 0 parameters.
  • Then it would look at the next token and see that it was an opening brace, so it would treat this as the implementation of the function main, rather than a declaration of main, which it would be if the next token had been a semicolon (even though you cannot declare main in C++). It would probably also create a counter to keep track of the nesting level of the statement blocks, to make sure the braces come in pairs.
  • After that it would look at the next token, and probably not do anything with it, but then it would see the :: operator and check that "std" was a valid namespace.
  • Then it would see the next token "cout" as the name of an identifier in the namespace "std", and see that it was a stream object.
  • The analyzer would see the << operator next, and so would check that the << operator could be used with cout, and also that the next token could be used with the << operator.
  • The same thing would happen with the next token after the ""hello world"" token. Then it would get to the "std" token again, look past it to see the :: operator token and check that the namespace existed again, then check to see if "endl" was in the namespace.
  • Then it would see the semicolon and so it would see that as the end of the statement.
  • Next it would see the keyword return, and then expect an integer value as the next token because main returns an integer, and it would find 0, which is an integer.
  • Then the next symbol is a semicolon so that is the end of the statement.
  • The next token is a closing brace, so that is the end of the function. There are no more tokens, so if the syntax analyzer did not find any errors in the code, it would pass the result on to the later stages of the compiler so that the program could be converted to machine language.

This is a simple view of syntax analysis, and real syntax analyzers do not really work this way, but the idea is the same.

The syntax analyzer also checks tokens against the keywords reserved by the C++ language, both to make sure you are not using any of them as identifier names and to know what type you are giving your variables or which built-in language facility you are using.

Compile speed


There are several factors that dictate how fast a compilation proceeds, like:

  • Hardware
    • Resources (Slow CPU, low memory and even a slow HDD can have an influence)
  • Software
    • The compiler itself. Newer versions are generally better, but the choice may depend on how portable you want the project to be.
    • The design selected for the program (structure of object dependencies, includes) will also factor in.

Experience shows that if you are suffering from slow compile times, the program you are trying to compile is most likely poorly designed. Take the time to structure your code so as to minimize recompilation after changes. Large projects will always compile more slowly. Use precompiled headers and external header guards. We will discuss ways to reduce compile time in the Optimization Section of this book.

Where to get a compiler


When you select your compiler, you must take into consideration your system's OS, your personal preferences and the documentation that you can get on using it.

In case you do not have, want or need a compiler installed on your machine, you can use one of the free web-based compilers, although you will have to change the code so that it does not require interactive input. You can always get one locally if you need it.

There are many compilers and even more IDEs available; some are free and open source. IDEs will often include the required compiler in the installation (GCC being the most common).

One of the most mature and compatible C++ compilers is GCC, also known as the GNU Compiler Collection. It is a free set of compilers developed by the Free Software Foundation, with Richard Stallman as one of the main architects.

There are many different pre-compiled GCC binaries on the Internet; some popular choices are listed below (with detailed steps for installation). You can easily find information on the GCC website on how to do it under another OS.

It is common for the implementation language of a compiler to be C (since it is normally the first system language above assembly that new systems implement). At the end of May 2010, GCC got the green light to start moving its core code base to C++. Considering that this is the most commonly used compiler and an open source implementation, it was an extremely positive step for the compiler and the language in general.

IDE (Integrated development environment)

[Figure: Graphical Vim under GTK2]

An integrated development environment is a software development system that often includes an editor, compiler and debugger in an integrated package that is distributed together. Some IDEs require the user to integrate the components themselves, and some people use the term IDE for the set of separate tools they use for programming.

A good IDE is one that permits the programmer to use it to abstract and accelerate some of the more common tasks and at the same time provides some help in reading and managing the code. Except for the compiler, the C++ Standard has no control over the different implementations. Most IDEs are visually oriented, especially the newer ones; they offer graphical debuggers and other visual aids, but some people still prefer the visual simplicity offered by powerful text editors like Vim or Emacs.

When selecting an IDE, remember that you are also investing time to become proficient in its use. Completeness, stability and portability across OSs will be important.

For Microsoft Windows, there is also Microsoft Visual Studio Community (version 2019 at the time of writing), which is currently freely available and includes most features. It includes a C++ compiler that can be used from the command line or from the supplied IDE.

In the book's Appendix B: External References you will find references to other freely available compilers and IDEs you can use.

On Windows


  1. Go to the Cygwin website and click on the "Install Cygwin Now" button in the upper right corner of the page.
  2. Click "run" in the window that pops up, and click "next" several times, accepting all the default settings.
  3. Choose any of the download sites when that window comes up; press "next" and the Cygwin installer should start downloading.
  4. When the "Select Packages" window appears, scroll down to the heading "Devel" and click on the "+" by it. In the list of packages that now displays, scroll down and find the "gcc-c++" package; this is the compiler. Click once on the word "Skip", and it should change to some number like "3.4" etc. (the version number), and an "X" will appear next to "gcc-core" and several other required packages that will now be downloaded.
  5. Click "next" and the compiler as well as the Cygwin tools should start downloading; this could take a while. While you are waiting, go to and download that free programmer's editor; it is powerful yet easy to use for beginners.
  6. Once the Cygwin downloads are finished and you have clicked "next", etc. to finish the installation, double-click the Cygwin icon on your desktop to begin the Cygwin "command prompt". Your home directory will automatically be set up in the Cygwin folder, which now should be at "C:\cygwin" (the Cygwin folder is in some ways like a small Unix/Linux computer on your Windows machine -- not technically of course, but it may be helpful to think of it that way).
  7. Type "g++" at the Cygwin prompt and press "enter"; if "g++: no input files" or something like it appears you have succeeded and now have the gcc C++ compiler on your computer (and congratulations -- you have also just received your first error message!).

MinGW + DevCpp-IDE

  1. Go to the Dev-C++ website (the original project is severely outdated, last updated in 2005, but an updated branch project exists), choose the version you want (scrolling down if necessary), and click on the appropriate download link. For the most current version, you will be redirected to another site.
  2. Scroll down to read the license and then to the download links. Download a version with Mingw/GCC. This is much easier than assembling the pieces yourself. With a very short delay (only a few days) you will always get the most current version of MinGW packaged with the devcpp IDE. The result is exactly the same as with a manual download of the required modules.
  3. You get an executable that can be run at user level under any WinNT version. If you want it to be set up for all users, however, you need admin rights. It will install devcpp and mingw in folders of your choice.
  4. Start the IDE and experience your first project!
    You will find something mostly similar to MSVC, including menu and button placement. Of course, many things are somewhat different if you were familiar with the former, but it is as simple as a handful of clicks to let your first program run.


  • Go to Delorie Software and download the GNU C++ compiler and other necessary tools. The site provides a Zip Picker in order to help identify which files you need, which is available from the main page.
  • Use unzip32 or another extraction utility to place the files into the directory of your choice (e.g. C:\DJGPP).
  • Set the environment variables to configure DJGPP for compilation, by either adding lines to autoexec.bat or a custom batch file:
  • If you are running MS-DOS or Windows 3.1, you need to add a few lines to config.sys if they are not already present:
    shell=c:\dos\ c:\dos /e:2048 /p

Note: The GNU C++ compiler under DJGPP is named gpp.

For Linux
  • For Gentoo, GCC C++ is part of the system core (since everything in Gentoo is compiled)
  • For Redhat, get a gcc-c++ RPM, e.g. using Rpmfind and then install (as root) using rpm -ivh gcc-c++-version-release.arch.rpm
  • For Fedora, install the GCC C++ compiler (as root) by using dnf install gcc-c++
  • For Mandrake, install the GCC C++ compiler (as root) by using urpmi gcc-c++
  • For Debian, install the GCC C++ compiler (as root) by using apt-get install g++
  • For Ubuntu, install the GCC C++ compiler by using sudo apt-get install g++
  • For openSUSE, install the GCC C++ compiler (as root) by using zypper in gcc-c++
  • If you cannot become root, get the tarball from [1] and follow the instructions in it to compile and install in your home directory.

For Mac OS X

Xcode (the IDE for Apple's OS X and iOS) above v4.1 uses Clang [2], a free and open source alternative to the GCC compiler that is largely compatible with it (even taking the same command line arguments). The IDE also has an older version of the GCC C++ compiler bundled. The compiler can be invoked from the Terminal in the same way as on Linux, and programs can also be compiled from within one of Xcode's projects.

Clang is not the only alternative, or even the only free alternative, to GCC. Some other possibilities are included in the External References section of the book. Clang has gained increased adoption as it permits better code optimization and internal indexing that enables support for more complex features in IDEs, like code completion, highlighting and other modern conveniences that programmers now increasingly rely on. Those are also possible with GCC, but they require extra build steps and tools, which makes them slower. While GCC is still the default standard as the free, open source C++ compiler, Clang seems to be gaining momentum. Binaries are available for Linux (Ubuntu), FreeBSD, OS X and Windows (mingw).

