Template:GNU C Compiler Internals/Additional Representations 4 1

Additional representations of GCC 4.1.
Additional representations of GCC 4.1.

Figure 2 shows additional representations of GCC 4.1.

Because of the differences in languages, the format of the generated ASTs is slightly different for each language. The next step after AST generation is the unification step in which the AST tree is converted into a unified form called generic. After this, the middle end part of the compiler takes control. First, the tree is converted into another representation called GIMPLE. In this form, each expression contains no more than three operands, all control flow constructs are represented as combinations of conditional statements and goto operators, arguments of a function call can only be variables, etc. Figure 2 illustrates the differences between a tree in generic form and a tree in GIMPLE form. GIMPLE is a convenient representations for optimizing the source code.


After GIMPLE, the source code is converted into the static single assignment (SSA) representation. The central idea of this form is the fact that each variable is assigned to only once, but can be used at the right hand side of an expression many times. Every time the same variable of a tree in the GIMPLE form is reassigned, the compiler creates a new version of that variable and stores the new value into it. When the same variable is assigned to in both branches of a conditional expression, one needs to merge the two possible values of the variable into a single variable. This operation is denoted as PHI function in the SSA form.


The SSA form is also used for optimizations. GCC performs more than 20 different optimizations on SSA trees. After the SSA optimization pass, the tree is converted back to the GIMPLE form.