ROSE Compiler Framework/Inliner
The ROSE Inliner inlines functions at function callsites.
Background
editAn inline function is one for which the compiler copies the code from the function definition directly into the code of the calling function rather than creating a separate set of instructions in memory. Instead of transferring control to and from the function code segment, a modified copy of the function body may be substituted directly for the function call. In this way, the performance overhead of a function call is avoided. The inline specifier is only a suggestion to the compiler that an inline expansion can be performed; the compiler is free to ignore the suggestion.
User Instructions
editYou must enable EDG 5.0 to inline C++11 code
- --enable-edg_version=5.0
install the tool
editconfigure and build ROSE as instructed at https://github.com/rose-compiler/rose/wiki
inlineEverything -c [options] input.c
Usage
editThis is a program transformation tool to inline function calls in your C/C++ or Fortran code.
Usage: inlineEverything -c [options] input.c
The optional options include:
- -skip-postprocessing: Skip postprocessing which cleanups code
- -process-headers: Process calls within header files
- -verbose: Printout debugging information
- -limit N: Inline up to N functions, then stop
- -main-only: Inline only functions reachable from main()
Source code
editAPI and implementation:
// main API bool doInline(SgFunctionCallExp*, bool)
The tool's source code
- https://github.com/rose-compiler/rose/tree/develop/tests/nonsmoke/functional/roseTests/astInliningTests/inlineEverything.C
- command line processing is handled in this source file
Algorithm
edit216 // Main inliner code. Accepts a function call as a parameter, and inlines 217 // only that single function call. Returns true if it succeeded, and false 218 // otherwise. The function call must be to a named function, static member 219 // function, or non-virtual non-static member function, and the function 220 // must be known (not through a function pointer or member function 221 // pointer). Also, the body of the function must already be visible. 222 // Recursive procedures are handled properly (when allowRecursion is set), by 223 // inlining one copy of the procedure into itself. Any other restrictions on 224 // what can be inlined are bugs in the inliner code. 225 bool 226 doInline(SgFunctionCallExp* funcall, bool allowRecursion)
Major steps
edit- eligibility check: skip things which cannot be inlined
- if a function call is used as an expression operand: e.g. a = func1() + func2();
- generate a temp variable to obtain the returned value: e.g. temp = func1();
- replace the function call expression with the temp variable. e.g. a = temp + temp;
- a slight optimization: if the function call is the only expression operand: e.g. a= func1(). No temp variable is needed (a can be directly used without another temp variable serving as intermediate variable.)
- obtain the list of actual argument
- make a copy of the body of the function to be inlined
- rename labels in an inlined function definition. goto statements to them will be updated.
- in the function body
- create local variables , one per formal argument, initialize each with the actual argument
- build a paramMap: mapping formal arguments (SgInitializedName) to new local variables (SgVariableSymbol)
- this pointer is handled similarly: create a local variable , initialized with the caller's this pointer
- replace variable references in the body with actual arguments // ReplaceParameterUseVisitor(paramMap).traverse(funbody_copy, postorder);
- insert a label to indicate the end of the inlined function body // rose_inline_end__
Limitations of this algorithm is not very clean:
- It generates new local variables and a label.
Eligibility check
editWhat can be inlined
- a named function,
- static member function, or
- qualified name does not start with "::std:: " // skip std:: functions
- non-virtual non-static member function, // skip virtual functions, static member functions cannot access this->data (non-static data). That is why we check non-static for this ptr case.
- the function must be known (not through a function pointer or member function pointer). // empty function reference expression
- the body of the function must already be visible in the current AST. // skip functions with empty body
postprocessing
editinlineEverything.C has a clean up step to make the outlined code better
- cleanupInlinedCode(sageProject);
- remove unused labels: SageInterface::removeUnusedLabels(top);
- remove jumps to next statement: SageInterface::removeJumpsToNextStatement(top);
- simpleCopyAndConstantPropagation() // In code with declarations such as "int foo = bar", where foo and bar are not modified, replace "foo" with "bar" and remove the declaration
- remove null statements: RemoveNullStatementsVisitor().traverse(top, postorder);
- move declaration to first use: MoveDeclarationsToFirstUseVisitor().traverse(top, postorder); e.g. int x__2 =7; w= x_2 +3; ==> w=7+3;
- doSubexpressionExpansionSmart () // Replaces all uses of a variable by its initialing expression. Requires that initname has an assign initializer Replaces all uses of initname in initname's scope by copy of its initializer expression. Then removes initname.
- changeAllMembersToPublic(sageProject);
This step does lots of things and can easily trigger some bugs.
- Even wrose, the cleanup step operates on the entire AST, including C++ headers.
// Post-inline AST normalizations // DQ (6/12/2015): These functions first renames all variable (a bit heavy handed for my tastes) // and then (second) removes the blocks that are otherwise added to support the inlining. The removal // of the blocks is the motivation for renaming the variables, but the variable renaming is // done evarywhere instead of just where the functions are inlined. I think the addition of // the blocks is a better solution than the overly agressive renaming of variables in the whole // program. So the best solution is to comment out both of these functions. All test codes // pass (including the token-based unparsing tests). // renameVariables(sageProject); // flattenBlocks(sageProject); cleanupInlinedCode(sageProject); // In code with declarations such as "int foo = bar", where foo and bar are // not modified, replace "foo" with "bar" and remove the declaration void simpleCopyAndConstantPropagation(SgNode* top) { FindReferenceVariablesVisitor().traverse(top, preorder); FindCopiesVisitor().traverse(top, preorder); FindUsedDeclarationsVisitor vis; vis.traverse(top, preorder); RemoveUnusedDeclarationsVisitor(vis.used_decls, set<SgFunctionDeclaration*>()).traverse(top, postorder); }
Testing
editTest directory with an example translator and test input files
By looking into Makefile.am, the example translator's source code will generate an executable named "inlineEverything" in your buildtree.
translator: inlineEverything
editThis is the tool you can try to inline your sample code.
- inlineEverything
The same Makefile.am's make check rules contain sample command lines to use the tool.
To test one individual input file (such as template_functions.C), type
- make inlineEverything_template_functions.C.passed // TODO: udpate to new ways to trigger the tests
Command line options
editinlineEverything --help
---------------------Tool-Specific Help----------------------------------- This is a program transformation tool to inline function calls in your C/C++ or Fortran code. Usage: inlineEverything -c [options] input.c The optional options include: -skip-postprocessing: Skip postprocessing which cleanups code -process-headers: Process calls within header files -verbose: Printout debugging information
postprocessing
edit--------------input---------------- bash-4.2$ cat specimen25_1.C template<typename T> void swap(T& x, T& y) { T tmp = x; x = y; y = tmp; } int foo (int a, int b) { swap(a,b); } int main() { } ----- command line ------------- bash-4.2$ inlineEverything -c specimen26_1.C -----------output: with postprocessing (cleanup) -------------- bash-4.2$ cat rose_specimen25_1.C template < typename T > void swap ( T & x, T & y ) { T tmp = x; x = y; y = tmp; } int foo(int a,int b) { { int tmp = a; a = b; b = tmp; } } int main() { } // output without postprocessing: cleanup template < typename T > void swap ( T & x, T & y ) { T tmp = x; x = y; y = tmp; } int foo(int a,int b) { { int &x__2 = a; int &y__3 = b; int tmp = x__2; x__2 = y__3; y__3 = tmp; rose_inline_end__4: ; } } int main() { }
Correctness checking
editmake check only check if the number of inlined functions is the same as expected.
# Note: must use the name convention of specimenXX_N.C , in which N is the number of function calls inlined. # The specimens are named so that the number between the "_" and next "." is the number of function calls that # we expect this specimen to inline. inlineEverything_specimens = \ specimen01_1.C .. inlineEverything_test_targets = $(addprefix inlineEverything_, $(addsuffix .passed, $(inlineEverything_specimens))) TEST_TARGETS += $(inlineEverything_test_targets) $(inlineEverything_test_targets): inlineEverything_%.passed: % inlineEverything inlineEverything.conf @$(RTH_RUN) \ TITLE="inlineEverything $< [$@]" \ SPECIMEN="$(abspath $<)" \ NINLINE="$$(echo $(notdir $<) |sed --regexp-extended 's/specimen[0-9]+_([0-9]+).*/\1/')" \ TRANSLATOR="$$(pwd)/inlineEverything" \ $(srcdir)/inlineEverything.conf $@ cat inlineEverything.conf # Test configuration file (see "scripts/rth_run.pl --help" for details) # Tests the inliner # Run the tests in subdirectories for ease of cleanup. subdir = yes # Run the test and then make sure the output contains a certain string cmd = ${VALGRIND} ${TRANSLATOR} -rose:verbose 0 ${SPECIMEN} -o a.out |tee ${TEMP_FILE_0} cmd = grep "Test inlined ${NINLINE} function" ${TEMP_FILE_0} cmd = cat -n rose_* cmd = ./a.out # Extra stuff that might be useful to specify in the makefile title = ${TITLE} disabled = ${DISABLED} timeout = ${TIMEOUT}
examples
editWe use a set of increasingly more complex examples to explain the inlining algorithm used.
More example input and output files are available at:
naked call: no parameters in nor return output
editextern int x; void incrementX() { x++; } int main() { incrementX(); return x; } //----------output, without postprocessing -------- extern int x; void incrementX() { x++; } int main() { // the function body is copied here { x++; rose_inline_end__2: // a label for the end of a function is generated. ; } return x; } //-----------output, with postprocessing for clean up // unused label and empty statement are removed extern int x; void incrementX() { x++; } int main() { // the function body is copied here { x++; } return x; }
function with a return
edit// a function with a return extern int x; int incrementX() { x++; return x; } int main() { incrementX(); return x; } //---------- output without postprocessing // a function with a return extern int x; int incrementX() { x++; return x; } int main() { { x++; { // a return statement is translated into a block, go to the exit point in the end x; goto rose_inline_end__2; } rose_inline_end__2: // a label for the end of the function: the exit point ; } return x; } //-------- with postprocessing, the code look the same // a function with a return extern int x; int incrementX() { x++; return x; } int main() { { x++; { goto rose_inline_end__2; } rose_inline_end__2: ; } return x; }
function calls as two expressions
edit// input code ----------------------- int foo(int i) { return 5+i; } int main(int, char**) { int w; w = foo(1)+ foo(2); return w; } //--------------after inlining----------------- // You can see that a temparory variable is used to capture the returned value of a function call. // Then the temp variable is used to replace the original function call expression int foo(int i) { return 5 + i; } int main(int ,char **) { int w; int rose_temp__4; { int i__2 = 1; { rose_temp__4 = 5 + i__2; goto rose_inline_end__3; } rose_inline_end__3: ; } int rose_temp__8; { int i__6 = 2; { rose_temp__8 = 5 + i__6; goto rose_inline_end__7; } rose_inline_end__7: ; } w = rose_temp__4 + rose_temp__8; return w; } //----- postprocessing does not simplify the code any further int foo(int i) { return 5 + i; } int main(int ,char **) { int rose_temp__4; { { rose_temp__4 = 5 + 1; goto rose_inline_end__3; } rose_inline_end__3: ; } int rose_temp__8; { { rose_temp__8 = 5 + 2; goto rose_inline_end__7; } rose_inline_end__7: ; } int w = rose_temp__4 + rose_temp__8; return w; }
function call as a single expression
editAn optimized transformation:
- not blindly generate a temp variable to capture the value of a function call.
- instead directly reuse the original declaration of the lhs variable inside the function body
int foo(int i) { return 5+i; } int main(int, char**) { int w; w = foo(1); return w; } //-------------after inlining ---------- int foo(int i) { return 5 + i; } int main(int ,char **) { int w; { int i__2 = 1; { w = 5 + i__2; goto rose_inline_end__3; } rose_inline_end__3: ; } return w; } //postprocessing does not simplify the code further. int foo(int i) { return 5 + i; } int main(int ,char **) { int w; { { w = 5 + 1; goto rose_inline_end__3; } rose_inline_end__3: ; } return w; }
3 operand operations
editthe code is normalized.
#include <stdlib.h> int foo() { exit (1); return 0; } int main(int, char**) { int w, x = 7; w = x == 8 ? foo() : 0; return w; } //----------- after inlining --------------- #include <stdlib.h> int foo() { exit(1); return 0; } int main(int ,char **) { int w; int x = 7; if (x == 8) { int rose_temp__4; { exit(1); { rose_temp__4 = 0; goto rose_inline_end__2; } rose_inline_end__2: ; } w = rose_temp__4; } else { w = 0; } return w; }
data member access function
edit#include <vector> typedef int Index_t ; struct Domain { public: // non-reference type Index_t numNode() { return m_numNode ; } void AllocateNodeElemIndexes() { Index_t numNode = this->numNode() ; } #if 0 // the best inline result should look like the following void AllocateNodeElemIndexes_inlined() { Index_t numNode = m_numNode; // call site 1 inlined } #endif private: Index_t m_numNode ; } domain; //----------------------------after inlining ---------------- #include <vector> typedef int Index_t; struct Domain { // non-reference type inline Index_t numNode() { return (this) -> m_numNode; } inline void AllocateNodeElemIndexes() { //x. split declaration + initializer into two parts // a temporary variable to transfer value of initializer Index_t rose_temp__3; //x. a new code block to embed the function body { struct Domain *this__1 = this__1; { rose_temp__3 = this__1 -> m_numNode; //x. goto the label after function call goto rose_inline_end__2; } //x. label after the function call rose_inline_end__2: ; } Index_t numNode = rose_temp__3; } Index_t m_numNode; } domain;
C++ template functions
edit--------------input---------------- bash-4.2$ cat specimen25_1.C template<typename T> void swap(T& x, T& y) { T tmp = x; x = y; y = tmp; } int foo (int a, int b) { swap(a,b); } int main() { } ----- command line ------------- bash-4.2$ inlineEverything -c -skip-postprocessing specimen26_1.C -----------output: with postprocessing (cleanup) -------------- // output without postprocessing: cleanup template < typename T > void swap ( T & x, T & y ) { T tmp = x; x = y; y = tmp; } int foo(int a,int b) { { int &x__2 = a; // local variables for each formal arguments, initialized with actual arguments int &y__3 = b; int tmp = x__2; // variable references are replace with the local variables x__2 = y__3; y__3 = tmp; rose_inline_end__4: // a label to indicate the end of the outlined function body. ; } } int main() { }
multiple-level function calls
edit//------------input int foo(int x) { return x + 3; } int bar(int y) { return foo(y) + foo(2); } int main(int, char**) { int w; w = bar(1); return 0; } //--------------- output, no postprocessing ------------ int foo (int x) { return x + 3; } int bar (int y) { int rose_temp__4; { int x__2 = y; { rose_temp__4 = x__2 + 3; goto rose_inline_end__3; } rose_inline_end__3: ; } int rose_temp__8; { int x__6 = 2; { rose_temp__8 = x__6 + 3; goto rose_inline_end__7; } rose_inline_end__7: ; } return rose_temp__4 + rose_temp__8; } int main (int, char **) { int w; { int y__10 = 1; int rose_temp__4; { int x__2 = y__10; { rose_temp__4 = x__2 + 3; goto rose_inline_end__3__1; } rose_inline_end__3__1: ; } int rose_temp__8; { int x__6 = 2; { rose_temp__8 = x__6 + 3; goto rose_inline_end__7__2; } rose_inline_end__7__2: ; } { w = rose_temp__4 + rose_temp__8; goto rose_inline_end__11; } rose_inline_end__11: ; } return 0; }
Tutorial
editOfficial documentation about how to call the inlining API is:
- Chapter 36 "Calling the Inliner of ROSE" tutorial: http://rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf
Troubleshooting
editTEST inlineEverything ../../../../../../sourcetree/tests/nonsmoke/functional/roseTests/astInliningTests/template_functions.C [inlineEverything_template_functions.C.passed]
inlineEverything_template_functions.C [out]: Note: C++11 input files to ROSE are NOT supported using EDG 4.9 configuration with GNU compilers 4.9 and greater (configure ROSE using EDG 4.12)