ROSE Compiler Framework/Abstract handles
Overview
editThis work is used in the autotuning and also other tools that pass references to source code as part of an interface.
- essentially it defines ways to create a string to uniquely identify a language construct in your source code
- any tool then can locate the corresponding node in AST and do the targetted analysis or transformation for you.
Abstract handles support the following forms for specifying language constructs:
- Source file position information including path, filename, line and column number etc. GNU standard source position from http://www.gnu.org/prep/standards/html_node/Errors.html presents some examples.
- Global or local numbering of specified language construct in source file (e.g. 2nd ”do” loop in a global scope). The file is itself specified using an abstract handle (typically generated from the file name).
- Global or local names of constructs. Some language constructs, such as files, function definitions and namespace, have names which can be used as their handle within a context.
- Language-specific label mechanisms. These include named constructs in Fortran, numbered labels in Fortran, and statement labels in C and C++, etc.
In addition to human-readable forms, compilers and tools can generate internal IDs for language constructs. It is up to compiler/tool developers to provide a way to convert their internal representations into human-readable formats.
Abstract Handles can have any of the human-readable or machine-generated forms. A handle can be used alone or combined with other handles to specify a language construct. A handle can also be converted from one form to another (e.g. from a compiler specific form to an human readable form relative to the source position; filename, line number, etc.). Abstract handles can have different lifetimes depending on their use and implementation. An abstract handle might be required to be persistent if it is used to reference a language construct that would be optimized over multiple executions of one or more different tools. Where as an abstract-handle might be internally generated only for purposes of optimizations used in a single execution (e.g. optimization within a compiler).
Use case
editA typical use can for Abstract Handles might be for a performance tool to identify a collection of loops in functions that are computationally intensive and to construct Abstract Handles that refer to this specific loops. Then pass the Abstract Handles to a second tool that might analyze the source code and/or the binary executable to evaluate if the computational costs are reasonable or if optimizations might be possible. The specific goal of the Abstract Handles is to support these sorts of uses within autotuning using diverse tools used and/or developed as part of autotuning research within the DOE SciDAC PERI project.
Syntax
editA possible grammar for abstract handles could be:
- A handle is a single handle item or a link of them separated by ::, or other delimiters
handle ::= handle_item | handle ’::’ handle_item
- Each handle item consists of construct_type and a specifier. Or it can be a compiler generated id of any forms.
handle_item ::= construct_type specifier | compiler_generated_handle
- Construct types are implementation dependent. An implementation can support a subset of legal constructs or all of them. We define a minimum set of common construct type names here and will grow this list as needed.
construct_type ::= Project|SourceFile|FunctionDeclaration|ForStatement|...
- A specifier is used to locate a particular construct e.g: <name, "foo">
specifier::= ’<’ specifier_type ’,’ specifier_value ’>’
- Tokens for specifier types could be name, position, numbering, label, etc. specifier type is necessary to avoid ambiguity for specifier values, because a same value could be interpreted in different specifier types otherwise
specifier_type::= name | position | numbering | label
- Possible values for a specifier
specifier_value::= string_lit|int_lit|position_value| label_value
- A label could be either integer or string
label_value::= int_lit | string_lit
- Start and end source line and column information
e.g.: 13.5-55.4, 13, 13.5 , 13.5-55
position_value:: = line_number[ ’.’ column_number][ ’-’ line_number[ ’.’ column_number]]
- Integer value: one or more digits
int_lit ::= [0-9]+
- String value: start with a letter, followed by zero or more letters or digits
string_lit ::= [a-z][a-z0-9]*
Reference Implementation
editAbstract Handles are fundamentally compiler and tool independent, however to clarify the con- cepts, provide meaningful examples, a working reference implementation we have provided a reference implementation in ROSE. The source files are located in src/midend/abstractHandle in the ROSE distribution. A generic interface (abstract handle.h and abstract handle.cpp) pro- vides data structures and operations for manipulating abstract handles using source file positions, numbering, or names. Any compilers and tools can have their own implementations using the same interface.
Connecting to ROSE
editA ROSE adapter (roseAdapter.h and roseAdapter.cpp) using the interface is provided as a concrete implementation for the maximum capability of the implementation (within a source- to-source compiler).
The follow image shows an input source file.
/∗ test input for generated abstract handles ∗/ int a[100][100][100]; void foo() { int i,j,k; for (i=0;i++;i<100) for (j=0;j++;j<100) for (k=0;k++;k<100) a [ i ] [ j ] [ k]= i+j+k ; for (i=0;i++;i<100) for (j=0;j++;j<100) for (k=0;k++;k<100) a [ i ] [ j ] [ k]+=5; }
The follow image shows the code (using ROSE) to generate abstract handles for loops in the input source file.
/∗ Example code to generate abstract handles for language constructs by Liao, 10/6/2008∗/ #include “rose.h” #include <iostream> #include “abstract_handle.h” #include “roseAdapter.h” #include <string.h> using namespace std ; using namespace AbstractHandle ; // a global handle for the current file static abstract handle∗ file handle = NULL; class visitorTraversal : public AstSimpleProcessing { protected: virtual void visit (SgNode∗ n); }; void visitorTraversal :: visit (SgNode∗ n) { SgForStatement∗ forloop = isSgForStatement(n); if (forloop) { cout<<”Creating handles for a loop construct...”<<endl; // Create an abstract node abstract node∗ anode= buildroseNode(forloop ); // Create an abstract handle from the abstract node // Using source position specifiers by default abstract handle ∗ ahandle = new abstract handle (anode ); cout<<ahandle−>toString()<<endl ; // Create handles based on numbering specifiers within the file abstract handle ∗ bhandle = new abstract handle (anode , e_numbering , file_handle ); cout<<bhandle−>toString()<<endl<<endl ; } } int main(int argc, char ∗ argv[]) { // Initialize and check compatibility. See Rose:: initialize ROSE INITIALIZE; SgProject ∗project = frontend (argc , argv ); // Generate a f i l e handle abstract node ∗ file node = buildroseNode((project−>get fileList())[0]); file handle = new abstract handle(file node); // Generate handles for language constructs visitorTraversal myvisitor ; myvisitor.traverseInputFiles(project ,preorder); // Generate source code from AST and c a l l the vendor ’ s return backend( project ); }
Abstract handle constructors generate handles from abstract nodes, which are implemented using ROSE AST nodes. Source position is used by default to generate a handle item. Names or numbering are used instead when source position information is not available. The Constructor can also be used to generate a handle item using a specified handle type (numbering handles in the example). The follow image is the output showing the generated handles for the loops.
Creating handles for a loop construct . . . Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mg r/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHandle1 .cpp>::ForStatement<position ,7.3−10.25> Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mg r/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHandle1 . cpp >:: ForStatement<numbering ,1> Creating handles for a loop construct . . . Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mgr/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHandle1 .cpp>::ForStatement<position ,8.5−10.25> Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mg r/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHandle1 . cpp >:: ForStatement<numbering ,2> Creating handles for a loop construct . . . Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mgr/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHandle1 .cpp>::ForStatement<position ,9.7−10.25> Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mg r/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHandle1 . cpp >:: ForStatement<numbering ,3> 24 Creating handles for a loop construct . . . Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mgr/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHandle1 .cpp>::ForStatement<position ,12.3−15.22> Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mg r/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHandle1 . cpp >:: ForStatement<numbering ,4> Creating handles for a loop construct . . . Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mgr/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHa 36 ndle1 .cpp>::ForStatement<position ,13.5−15.22> Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mg r/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHandle1 . cpp >:: ForStatement<numbering ,5> Creating handles for a loop construct . . . Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mgr/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHandle1 .cpp>::ForStatement<position ,14.7−15.22> Project<numbering,1>::FileList<numbering,1>::SourceFile<name,/export/tmp.rose−mg r/jenkins/edg4x/workspace/release−ROSE−docs−weekly/tutorial/inputCode AbstractHa ndle1 . cpp >:: ForStatement<numbering ,6>
Connecting to External Tools
editA third example is provided to demonstrate how to use the abstract interface with any other tools, which may have less features in terms of supported language constructs and their correlations compared to a compiler. Assume a tool operating on some simple for-loops within an arbitrary source file (the input file is not shown in this example). Such a tool might have an internal data structure representing loops; such as that in given in the next figure.
/∗ A toy loop data structure demonstrating a thin client of abstract handles: ∗ A simplest loop tool which keeps a tree of loops in a file ∗/ #ifndef my loop INCLUDED #define my loop INCLUDED #include <string> #include <vector> class MyLoop { public : std::string sourceFileName; size t line number ; std : : vector<MyLoop∗> children ; MyLoop∗ parent ; }; #endif
We will show how the tool specific data structure for loops can be used to generate abstract handles and output as strings that can be used by other tools which use abstract handles (which would generate the abstract handles by reading the strings).
An adapter (loopAdapter.h and loopAdapter.cpp) using the proposed abstract handle in- terface is given in src/midend/abstractHandle. It provides a concrete implementation for the interface for the simple loops and adds a node to support file nodes (Compared to a full-featured IR for a compiler, the file node is an additional detail for tools without data structures to support files). The test program is given in next Figure.
#include <iostream> #include <string> #include <vector> #include ”abstract handle .h” #include ”myloop . h” ”loopAdapter .h” using namespace std ; using namespace AbstractHandle ; 10 int main() { //−−−−−−−−−−−−−Preparing the internal loop representation−−−−−−−−− // declare and initialize a list of loops using MyLoop // The loop tool should be able to generate its representation from // source code somehow. We fill it up manually here. vector <MyLoop∗ > loops; MyLoop loop1 , loop2 , loop3; loop1.sourceFileName=”file1.c”; loop1.linenumber = 7; loop1.parent = NULL; loop2.sourceFileName=”file1.c”; loop2.linenumber = 8; loop2.parent = &loop1; loop1.children.pushback(&loop2 ); loop3.sourceFileName=”file1.c”; loop3.linenumber = 12; loop3.parent=NULL; loops.pushback(&loop1); loops.pushback(&loop3); //−−−−−−−−−−−−−−−−−− using abstract handles −−−−−−−−−−−−− //Generate the abstract handle for the source file fileNode∗ filenode = new fileNode(”file1 .c”); filenode−>setMLoops(loops); abstract handle∗ file handle = new abstract handle(filenode); cout<<”Created a file handle:”<<endl<<file handle−>toString()<<endl; //Create a loop handle within the f i l e using numbering info. abstract node∗ loop node1 = new loopNode(&loop1); abstract handle∗ loop handle1 = new abstract handle(loop node1 ,e_numbering , file_handle); cout<<”Created a loop handle:”<<endl<<loop handle1−>toString()<<endl; //Create another loop handle within a file using its source position information string input1(”ForStatement<position ,12>”); abstract handle∗ loop handle2 = new abstract handle ( file_handle , input1 ); cout<<”Created a loop handle:”<<endl<<loop handle2−>toString()<<endl; //Create yet another loop handle within a loop using its relative numbering information string input2(”ForStatement<numbering,1>”); abstract handle∗ loop handle3 = new abstract handle(loop handle1 ,input2); cout<<”Created a loop handle:”<<endl<<loop handle3−>toString()<<endl; return 0 ; }
Again, it creates a top level file handle first. Then a loop handle (loop handle1) is created within the file handle using its relative numbering information. The loop handle2 is created from its string format using file position infor- mation (using GNU standard file position syntax). The loop handle3 uses its relative numbering information within loop handle1.
The output of the program is shown in the follow image. It demonstrates the generated strings to represent the abstract handles in the arbitrary code operated upon by the tool. Interoperability is achieved by another tool reading in the generated string representation to generate an abstract handle to the same source code language construct.
bash −3.00: ./testMyLoop Created a file handle: SourceFile<name, file1 .c> Created a loop handle : SourceFile<name, file1 .c>::ForStatement<numbering,1> Created a loop handle : SourceFile<name, file1 .c>::ForStatement<position ,12> Created a loop handle : SourceFile<name, file1 .c>::ForStatement<numbering,1>::ForStatement<numbering,1>
Examples
editLanguage handles using the syntax
editWe give some examples of language handles using the syntax mentioned above. Canonical AST’s node type names are used as the construct type names. Other implementations can use their own construct type names.
- A file handle consisting of only one handle item:
SourceFile<name,"/home/PERI/test111.f">
- A function handle using a named handle item, combined with a parent handle using a name also:
SourceFile<name,"/home/PERI/test111.f">::FunctionDeclaration<name,"foo”>
- A function handle using source position(A function starting at line 12, column 1 till line 30, column 1 within a file):
SourceFile<name,"/home/PERI/test111.f">::FunctionDeclaration<position,"12.1-30.1">
- A function handle using numbering(The first function definition in a file):
SourceFile<name,/home/PERI/test111.f">::FunctionDeclaration<numbering,1>
- A return statement using source position (A return statement at line 100):
SourceFile<name,/home/PERI/test222.c>::ReturnStatement<position,"100">
- A loop using numbering information (The second loop in function main()):
SourceFile<name,"/home/PERI/test222.c">::FunctionDeclaration<name,"main">::ForStatement<numbering,2>
- A nested loop using numbering information (The first loop inside the second loop in func- tion main()):
SourceFile<name,"/home/PERI/test222.c">::FunctionDeclaration<name,"main">:: ForStatement<numbering,2>::ForStatement<numbering,1>
Input Code Examples
editOutline the for loop at line 12 of the input code
- ./outline -rose:outline:abstract_handle "ForStatement<position,12>" -c $(srcdir)/inputCode_OutlineLoop2.c
Outline the 2nd for statement within a function named initialize() from the input code
- ./outline -rose:outline:abstract_handle "FunctionDeclaration<name,initialize>::ForStatement<numbering,2>" -c $(srcdir)/inputCode_OutlineLoop2.c -rose:output rose_inputCode_OutlineLoop2b.c
Outline the statement at line 5 of the input code
- ./outline -rose:outline:abstract_handle "Statement<position,5>" -c $(srcdir)/declarations.c
Summary
editAbstract handles are low level mechanisms to support multiple tools to exchange references to source code. Several examples are used to present the different features of abstract handles. Importantly, the specification of abstract handles is tool independent. A reference implementa- tion is provided and is publically available within the ROSE compiler framework. We encourage debate on the pros and cons of this concept to support interoperability of tools which must pass references to source code between them. This work is expected to a small piece of the infrastructure to suport autotuning research.
Key info
edit- Documentation: Chapter 46 of ROSE Tutorial: http://rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf
- Header file: https://github.com/rose-compiler/rose-develop/blob/master/src/midend/abstractHandle/abstract_handle.h
- Example use: https://github.com/rose-compiler/rose-develop/blob/master/projects/autoTuning/tests/Makefile.am
- http://rosecompiler.org/autoTuning.pdf Section 6.2 explains how parameterized loop transformations are used on specified loops, using abstract handles.
- https://github.com/rose-compiler/rose-develop/blob/master/tests/nonsmoke/functional/roseTests/astInterfaceTests/loopCollapsing.C a loop collapse tool accepting abstract handles
code sample
editHow to support abstract handles in your tool
#include "rose.h" #include <string> #include <iostream> #include "commandline_processing.h" using namespace std; using namespace AbstractHandle; int main(int argc, char * argv[]) { std::string handle; int factor =2; // command line processing //-------------------------------------------------- vector<std::string> argvList (argv, argv+argc); if (!CommandlineProcessing::isOptionWithParameter (argvList,"-rose:loopcollapse:","abstract_handle",handle, true) || !CommandlineProcessing::isOptionWithParameter (argvList,"-rose:loopcollapse:","factor",factor, true)) { cout<<"Usage: loopCollapsing inputFile.c -rose:loopcollapse:abstract_handle <handle_string> -rose:loopcollapse:factor N"<<endl; return 0; } // Retrieve corresponding SgNode from abstract handle //-------------------------------------------------- SgProject *project = frontend (argvList); SgStatement* stmt = NULL; ROSE_ASSERT(project != NULL); SgFilePtrList & filelist = project->get_fileList(); SgFilePtrList::iterator iter= filelist.begin(); for (;iter!=filelist.end();iter++) { SgSourceFile* sfile = isSgSourceFile(*iter); if (sfile != NULL) { // prepare a file handle first abstract_node * file_node = buildroseNode(sfile); ROSE_ASSERT (file_node); abstract_handle* fhandle = new abstract_handle(file_node); ROSE_ASSERT (fhandle); // try to match the string and get the statement handle std::string cur_handle = handle; abstract_handle * shandle = new abstract_handle (fhandle,cur_handle); // it is possible that a handle is created but no matching IR node is found if (shandle != NULL) { if (shandle->getNode() != NULL) { // get SgNode from the handle SgNode* target_node = (SgNode*) (shandle->getNode()->getNode()); ROSE_ASSERT(isSgStatement(target_node)); stmt = isSgStatement(target_node); break; } } } //end if sfile } // end for if (stmt==NULL) { cout<<"Cannot find a matching target from a handle:"<<handle<<endl; return 0; } //-------------------------------------------------- if (isSgForStatement(stmt)) { bool result=false; SgForStatement *target_loop = isSgForStatement(stmt); result = SageInterface::loopCollapsing(target_loop, factor); ROSE_ASSERT(result != false); } // Generate source code from AST and call the vendor's compiler return backend(project); }
reference
editTo cite the abstract handle work, please use the following paper in which the handles were first published:
- Chunhua Liao, Daniel J. Quinlan, Richard Vuduc and Thomas Panas, Effective Source-to-Source Outlining to Support Whole Program Empirical Optimization, The 22nd International Workshop on Languages and Compilers for Parallel Computing (LCPC), Newark, Delaware, USA. October 8-10, 2009