ROSE Compiler Framework/Inliner

Background edit

An inline function is one for which the compiler copies the code from the function definition directly into the code of the calling function rather than creating a separate set of instructions in memory. Instead of transferring control to and from the function code segment, a modified copy of the function body may be substituted directly for the function call. In this way, the performance overhead of a function call is avoided. The inline specifier is only a suggestion to the compiler that an inline expansion can be performed; the compiler is free to ignore the suggestion.

User Instructions edit

You must enable EDG 5.0 to inline C++11 code

--enable-edg_version=5.0

install the tool edit

configure and build ROSE as instructed at https://github.com/rose-compiler/rose/wiki

inlineEverything -c [options] input.c

Usage edit

This is a program transformation tool to inline function calls in your C/C++ or Fortran code.

Usage: inlineEverything -c [options] input.c

The optional options include:

-skip-postprocessing: Skip postprocessing which cleanups code
-process-headers: Process calls within header files
-verbose: Printout debugging information
-limit N: Inline up to N functions, then stop
-main-only: Inline only functions reachable from main()

Source code edit

API and implementation:

// main API
bool doInline(SgFunctionCallExp*, bool)

https://github.com/rose-compiler/rose/tree/develop/src/midend/programTransformation/astInlining

The tool's source code

https://github.com/rose-compiler/rose/tree/develop/tests/nonsmoke/functional/roseTests/astInliningTests/inlineEverything.C
- command line processing is handled in this source file

Algorithm edit


216 // Main inliner code.  Accepts a function call as a parameter, and inlines
217 // only that single function call.  Returns true if it succeeded, and false
218 // otherwise.  The function call must be to a named function, static member
219 // function, or non-virtual non-static member function, and the function
220 // must be known (not through a function pointer or member function
221 // pointer).  Also, the body of the function must already be visible.
222 // Recursive procedures are handled properly (when allowRecursion is set), by
223 // inlining one copy of the procedure into itself.  Any other restrictions on
224 // what can be inlined are bugs in the inliner code.
225 bool
226 doInline(SgFunctionCallExp* funcall, bool allowRecursion)

Major steps edit

eligibility check: skip things which cannot be inlined
if a function call is used as an expression operand: e.g. a = func1() + func2();
- generate a temp variable to obtain the returned value: e.g. temp = func1();
- replace the function call expression with the temp variable. e.g. a = temp + temp;
- a slight optimization: if the function call is the only expression operand: e.g. a= func1(). No temp variable is needed (a can be directly used without another temp variable serving as intermediate variable.)
obtain the list of actual argument
make a copy of the body of the function to be inlined
rename labels in an inlined function definition. goto statements to them will be updated.
in the function body
- create local variables , one per formal argument, initialize each with the actual argument
- build a paramMap: mapping formal arguments (SgInitializedName) to new local variables (SgVariableSymbol)
- this pointer is handled similarly: create a local variable , initialized with the caller's this pointer
- replace variable references in the body with actual arguments // ReplaceParameterUseVisitor(paramMap).traverse(funbody_copy, postorder);
- insert a label to indicate the end of the inlined function body // rose_inline_end__

Limitations of this algorithm is not very clean:

It generates new local variables and a label.

Eligibility check edit

What can be inlined

a named function,
static member function, or
qualified name does not start with "::std:: " // skip std:: functions
non-virtual non-static member function, // skip virtual functions, static member functions cannot access this->data (non-static data). That is why we check non-static for this ptr case.
the function must be known (not through a function pointer or member function pointer). // empty function reference expression
the body of the function must already be visible in the current AST. // skip functions with empty body

postprocessing edit

inlineEverything.C has a clean up step to make the outlined code better

cleanupInlinedCode(sageProject);
- remove unused labels: SageInterface::removeUnusedLabels(top);
- remove jumps to next statement: SageInterface::removeJumpsToNextStatement(top);
- simpleCopyAndConstantPropagation() // In code with declarations such as "int foo = bar", where foo and bar are not modified, replace "foo" with "bar" and remove the declaration
- remove null statements: RemoveNullStatementsVisitor().traverse(top, postorder);
- move declaration to first use: MoveDeclarationsToFirstUseVisitor().traverse(top, postorder); e.g. int x__2 =7; w= x_2 +3; ==> w=7+3;
- doSubexpressionExpansionSmart () // Replaces all uses of a variable by its initialing expression. Requires that initname has an assign initializer Replaces all uses of initname in initname's scope by copy of its initializer expression. Then removes initname.
changeAllMembersToPublic(sageProject);

This step does lots of things and can easily trigger some bugs.

Even wrose, the cleanup step operates on the entire AST, including C++ headers.


 // Post-inline AST normalizations

 // DQ (6/12/2015): These functions first renames all variable (a bit heavy handed for my tastes)
 // and then (second) removes the blocks that are otherwise added to support the inlining.  The removal
 // of the blocks is the motivation for renaming the variables, but the variable renaming is
 // done evarywhere instead of just where the functions are inlined.  I think the addition of
 // the blocks is a better solution than the overly agressive renaming of variables in the whole
 // program.  So the best solution is to comment out both of these functions.  All test codes
 // pass (including the token-based unparsing tests).
 // renameVariables(sageProject);
 // flattenBlocks(sageProject);

    cleanupInlinedCode(sageProject);

// In code with declarations such as "int foo = bar", where foo and bar are
// not modified, replace "foo" with "bar" and remove the declaration

void simpleCopyAndConstantPropagation(SgNode* top) {
  FindReferenceVariablesVisitor().traverse(top, preorder);
  FindCopiesVisitor().traverse(top, preorder);
  FindUsedDeclarationsVisitor vis;
  vis.traverse(top, preorder);
  RemoveUnusedDeclarationsVisitor(vis.used_decls, set<SgFunctionDeclaration*>()).traverse(top, postorder);
}

Testing edit

Test directory with an example translator and test input files

https://github.com/rose-compiler/rose/tree/develop/tests/nonsmoke/functional/roseTests/astInliningTests

By looking into Makefile.am, the example translator's source code will generate an executable named "inlineEverything" in your buildtree.

translator: inlineEverything edit

This is the tool you can try to inline your sample code.

inlineEverything

The same Makefile.am's make check rules contain sample command lines to use the tool.

To test one individual input file (such as template_functions.C), type

make inlineEverything_template_functions.C.passed // TODO: udpate to new ways to trigger the tests

Command line options edit

inlineEverything --help

---------------------Tool-Specific Help-----------------------------------
This is a program transformation tool to inline function calls in your C/C++ or Fortran code.
Usage: inlineEverything -c [options] input.c


The optional options include: 
 -skip-postprocessing: Skip postprocessing which cleanups code
 -process-headers:     Process calls within header files
 -verbose:            Printout debugging information

postprocessing edit


--------------input----------------
bash-4.2$ cat specimen25_1.C
 template<typename T>
 void swap(T& x, T& y)
 {
   T tmp = x;
   x = y;
   y = tmp;
 }
 
int foo (int a, int b)
{
   swap(a,b);
}
 
int main()
{
}
 
----- command line  -------------
bash-4.2$ inlineEverything -c specimen26_1.C
 
-----------output: with postprocessing (cleanup) --------------
bash-4.2$ cat rose_specimen25_1.C 
template < typename T >
 void swap ( T & x, T & y )
 {
   T tmp = x;
   x = y;
   y = tmp;
 }
 
int foo(int a,int b)
{
{
    int tmp = a;
    a = b;
    b = tmp;
  }
}
 
int main()
{
}



// output without postprocessing: cleanup
template < typename T >
 void swap ( T & x, T & y )
 {
   T tmp = x;
   x = y;
   y = tmp;
 }

int foo(int a,int b)
{
{
    int &x__2 = a;
    int &y__3 = b;
    int tmp = x__2;
    x__2 = y__3;
    y__3 = tmp;
    rose_inline_end__4:
    ;
  }
}

int main()
{
}

Correctness checking edit

make check only check if the number of inlined functions is the same as expected.

# Note: must use the name convention of specimenXX_N.C , in which N is the number of function calls inlined.   
# The specimens are named so that the number between the "_" and next "." is the number of function calls that
# we expect this specimen to inline.
inlineEverything_specimens =                    \
        specimen01_1.C   ..




inlineEverything_test_targets = $(addprefix inlineEverything_, $(addsuffix .passed, $(inlineEverything_specimens)))
TEST_TARGETS += $(inlineEverything_test_targets)
$(inlineEverything_test_targets): inlineEverything_%.passed: % inlineEverything inlineEverything.conf
        @$(RTH_RUN)                                                                                             \
                TITLE="inlineEverything $< [$@]"                                                                \
                SPECIMEN="$(abspath $<)"                                                                        \
                NINLINE="$$(echo $(notdir $<) |sed --regexp-extended 's/specimen[0-9]+_([0-9]+).*/\1/')"        \
                TRANSLATOR="$$(pwd)/inlineEverything"                                                           \
                $(srcdir)/inlineEverything.conf $@


 cat inlineEverything.conf 
# Test configuration file (see "scripts/rth_run.pl --help" for details)
# Tests the inliner

# Run the tests in subdirectories for ease of cleanup.
subdir = yes

# Run the test and then make sure the output contains a certain string
cmd = ${VALGRIND} ${TRANSLATOR} -rose:verbose 0 ${SPECIMEN} -o a.out |tee ${TEMP_FILE_0}
cmd = grep "Test inlined ${NINLINE} function" ${TEMP_FILE_0}
cmd = cat -n rose_*
cmd = ./a.out

# Extra stuff that might be useful to specify in the makefile
title = ${TITLE}
disabled = ${DISABLED}
timeout = ${TIMEOUT}

examples edit

We use a set of increasingly more complex examples to explain the inlining algorithm used.

More example input and output files are available at:

https://github.com/chunhualiao/inliner-demo

naked call: no parameters in nor return output edit


extern int x;

void incrementX()
   {
      x++;
   }

int main()
   {
     incrementX();
     return x;
   }


//----------output, without postprocessing --------

extern int x;

void incrementX()
{
  x++;
}

int main()
{
   // the function body is copied here
  {              
    x++;
    rose_inline_end__2:   // a label for the end of a function is generated. 
    ;
  }
  
  return x;
}

//-----------output, with postprocessing for clean up
// unused label and empty statement are removed

extern int x;

void incrementX()
{
  x++;
}

int main()
{
   // the function body is copied here
  {              
    x++;
  }
  
  return x;
}

function with a return edit

// a function with a return
extern int x;


int incrementX()
{
  x++;
  return x; 
}

int main()
{
  incrementX();
  return x;
}

//---------- output without postprocessing

// a function with a return
extern int x;

int incrementX()
{
  x++;
  return x;
}

int main()
{
  {
    x++;
    {                    // a return statement is translated into a block, go to the exit point in the end
      x;
      goto rose_inline_end__2;
    }
    rose_inline_end__2:  // a label for the end of the function: the exit point
    ;
  }
  return x;
}

//-------- with postprocessing, the code look the same 

// a function with a return
extern int x;

int incrementX()
{
  x++;
  return x;
}

int main()
{
  {
    x++;
    {
      goto rose_inline_end__2;
    }
rose_inline_end__2:
    ;
  }
  return x;
}

function calls as two expressions edit

// input code -----------------------
int foo(int i) {
  return 5+i;
}

int main(int, char**) {
  int w;
  w = foo(1)+ foo(2);
  return w;
}

//--------------after inlining-----------------
// You can see that a temparory variable is used to capture the returned value of a function call.
// Then the temp variable is used to replace the original function call expression
int foo(int i)
{
  return 5 + i;
}

int main(int ,char **)
{
  int w;
  int rose_temp__4;
  {
    int i__2 = 1;
    {
      rose_temp__4 = 5 + i__2;
      goto rose_inline_end__3;
    }
    rose_inline_end__3:
    ;
  }
  int rose_temp__8;
  {
    int i__6 = 2;
    {
      rose_temp__8 = 5 + i__6;
      goto rose_inline_end__7;
    }
    rose_inline_end__7:
    ;
  }
  w = rose_temp__4 + rose_temp__8;
  return w;
}

//----- postprocessing does not simplify the code any further
int foo(int i)
{
  return 5 + i;
}

int main(int ,char **)
{
  int rose_temp__4;
  {
    {
      rose_temp__4 = 5 + 1;
      goto rose_inline_end__3;
    }
    rose_inline_end__3:
      ;
  }
  int rose_temp__8;
  {
    {
      rose_temp__8 = 5 + 2;
      goto rose_inline_end__7;
    }
rose_inline_end__7:
    ;
  }
  int w = rose_temp__4 + rose_temp__8;
  return w;
}

function call as a single expression edit

An optimized transformation:

not blindly generate a temp variable to capture the value of a function call.
instead directly reuse the original declaration of the lhs variable inside the function body


int foo(int i) {
  return 5+i;
}

int main(int, char**) {
  int w;
  w = foo(1);
  return w;
}


//-------------after inlining ----------

int foo(int i)
{
  return 5 + i;
}

int main(int ,char **)
{
  int w;
  {
    int i__2 = 1;
    {
      w = 5 + i__2;
      goto rose_inline_end__3;
    }
rose_inline_end__3:
    ;
  }
  return w;
}


//postprocessing does not simplify the code further.
int foo(int i)
{
  return 5 + i;
}

int main(int ,char **)
{
  int w;
  {
    {
      w = 5 + 1;
      goto rose_inline_end__3;
    } 
rose_inline_end__3:
    ;
  } 
  return w;
}

3 operand operations edit

the code is normalized.

#include <stdlib.h>

int foo() {
  exit (1);
  return 0;
}

int main(int, char**) {
  int w, x = 7;
  w = x == 8 ? foo() : 0;
  return w;
}

//----------- after inlining ---------------

#include <stdlib.h>

int foo()
{
  exit(1);
  return 0;
}

int main(int ,char **)
{
  int w;
  int x = 7;
  if (x == 8) {
    int rose_temp__4;
    {
      exit(1);
      {
        rose_temp__4 = 0;
        goto rose_inline_end__2;
      }
rose_inline_end__2:
      ;
    }
    w = rose_temp__4;
  }
  else {
    w = 0;
  }
  return w;
}

data member access function edit

#include <vector>
typedef int    Index_t ;

struct Domain
{
  public:
    // non-reference type
    Index_t  numNode()            { return m_numNode ; }

    void AllocateNodeElemIndexes()
    {
      Index_t numNode = this->numNode() ;
    }

#if 0  // the best inline result should look like the following
    void AllocateNodeElemIndexes_inlined()
    {
      Index_t numNode = m_numNode; // call site 1 inlined
    }
#endif

  private:
    Index_t   m_numNode ;
} domain;

//----------------------------after inlining ----------------

#include <vector>

typedef int Index_t;

struct Domain {

  // non-reference type
  inline Index_t numNode()
  {
    return (this) -> m_numNode;
  }

  inline void AllocateNodeElemIndexes()
  {

//x. split declaration + initializer into two parts
// a temporary variable to transfer value of initializer

    Index_t rose_temp__3;

//x. a new code block to embed the function body
    {
      struct Domain *this__1 = this__1;
      {
        rose_temp__3 = this__1 -> m_numNode;

//x. goto the label after function call
        goto rose_inline_end__2;
      }

//x. label after the function call
rose_inline_end__2:
      ;
    }

    Index_t numNode = rose_temp__3;
  }

  Index_t m_numNode;

} domain;

C++ template functions edit

--------------input----------------
bash-4.2$ cat specimen25_1.C
 template<typename T>
 void swap(T& x, T& y)
 {
   T tmp = x;
   x = y;
   y = tmp;
 }
 
int foo (int a, int b)
{
   swap(a,b);
}
 
int main()
{
}
 
----- command line  -------------
bash-4.2$ inlineEverything -c -skip-postprocessing specimen26_1.C
 
-----------output: with postprocessing (cleanup) --------------
// output without postprocessing: cleanup
template < typename T >
 void swap ( T & x, T & y )
 {
   T tmp = x;
   x = y;
   y = tmp;
 }

int foo(int a,int b)
{
{
    int &x__2 = a;    // local variables for each formal arguments, initialized with actual arguments
    int &y__3 = b;

    int tmp = x__2;   // variable references are replace with the local variables
    x__2 = y__3;
    y__3 = tmp;
    rose_inline_end__4:   // a label to indicate the end of the outlined function body. 
    ;
  }
}

int main()
{
}

multiple-level function calls edit

//------------input 

int foo(int x) {
  return x + 3;
}

int bar(int y) {
  return foo(y) + foo(2);
}

int main(int, char**) {
  int w;
  w = bar(1);
  return 0;
}

//--------------- output, no postprocessing ------------

int
foo (int x)
{
  return x + 3;
}

int
bar (int y)
{
  int rose_temp__4;
  {
    int x__2 = y;
    {
      rose_temp__4 = x__2 + 3;
      goto rose_inline_end__3;
    }
  rose_inline_end__3:
    ;
  }
  int rose_temp__8;
  {
    int x__6 = 2;
    {
      rose_temp__8 = x__6 + 3;
      goto rose_inline_end__7;
    }
  rose_inline_end__7:
    ;
  }
  return rose_temp__4 + rose_temp__8;
}

int
main (int, char **)
{
  int w;
  {
    int y__10 = 1;
    int rose_temp__4;
    {
      int x__2 = y__10;
      {
	rose_temp__4 = x__2 + 3;
	goto rose_inline_end__3__1;
      }
    rose_inline_end__3__1:
      ;
    }
    int rose_temp__8;
    {
      int x__6 = 2;
      {
	rose_temp__8 = x__6 + 3;
	goto rose_inline_end__7__2;
      }
    rose_inline_end__7__2:
      ;
    }
    {
      w = rose_temp__4 + rose_temp__8;
      goto rose_inline_end__11;
    }
  rose_inline_end__11:
    ;
  }
  return 0;
}

Tutorial edit

Official documentation about how to call the inlining API is:

Chapter 36 "Calling the Inliner of ROSE" tutorial: http://rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf

Troubleshooting edit

TEST inlineEverything ../../../../../../sourcetree/tests/nonsmoke/functional/roseTests/astInliningTests/template_functions.C [inlineEverything_template_functions.C.passed]

inlineEverything_template_functions.C [out]: Note: C++11 input files to ROSE are NOT supported using EDG 4.9 configuration with GNU compilers 4.9 and greater (configure ROSE using EDG 4.12)