ROSE Compiler Framework/Abstract Syntax Tree
The main intermediate representation of ROSE is its abstract syntax tree (AST). To use a programming language, you have to get familiar with the language syntax, semantics, etc. To use ROSE, you have to get familiar with its internal representation of an input code.
The best way to get to know the AST is to visualize it using the simplest possible code samples.
Visualization of AST
Overview
Three things are needed to visualize the ROSE AST:
- Sample input code: you provide it
- a dot graph generator to generate a dot file from AST: ROSE provides dot graph generators
- a visualization tool to open the dot graph: ZGRViewer and Graphviz are used by ROSE developers
If you don't want to install ROSE, ZGRViewer, and Graphviz from scratch, you can use the ROSE virtual machine image directly. It has everything you need installed and configured, so you can go straight to visualizing your sample code.
Sample input code
Please prepare the simplest possible input code, without including any headers, so that the resulting AST is small enough to digest.
Dot Graph Generator
We provide ROSE_INSTALLATION_TREE/bin/dotGeneratorWholeASTGraph (complex graph) and dotGenerator (a simpler version) to generate a dot graph of the detailed AST of input code.
The two tools generate an AST graph in dot format:
- dotGenerator: simple AST graph generator showing essential nodes and edges
- dotGeneratorWholeASTGraph: whole AST graph showing more details. It provides filter options to show/hide certain AST information.
command line:
dotGeneratorWholeASTGraph yourcode.c // it is best to avoid including any header into your input code to have a small enough tree to visualize.
To skip builtin functions
- dotGeneratorWholeASTGraph -DSKIP_ROSE_BUILTIN_DECLARATIONS yourcode.c
dotGeneratorWholeASTGraph -rose:help
   -rose:help                                            show this help message
   -rose:dotgraph:asmFileFormatFilter [0|1]              Disable or enable asmFileFormat filter
   -rose:dotgraph:asmTypeFilter [0|1]                    Disable or enable asmType filter
   -rose:dotgraph:binaryExecutableFormatFilter [0|1]     Disable or enable binaryExecutableFormat filter
   -rose:dotgraph:commentAndDirectiveFilter [0|1]        Disable or enable commentAndDirective filter
   -rose:dotgraph:ctorInitializerListFilter [0|1]        Disable or enable ctorInitializerList filter
   -rose:dotgraph:defaultFilter [0|1]                    Disable or enable default filter
   -rose:dotgraph:defaultColorFilter [0|1]               Disable or enable defaultColor filter
   -rose:dotgraph:edgeFilter [0|1]                       Disable or enable edge filter
   -rose:dotgraph:expressionFilter [0|1]                 Disable or enable expression filter
   -rose:dotgraph:fileInfoFilter [0|1]                   Disable or enable fileInfo filter
   -rose:dotgraph:frontendCompatibilityFilter [0|1]      Disable or enable frontendCompatibility filter
   -rose:dotgraph:symbolFilter [0|1]                     Disable or enable symbol filter
   -rose:dotgraph:emptySymbolTableFilter [0|1]           Disable or enable emptySymbolTable filter
   -rose:dotgraph:typeFilter [0|1]                       Disable or enable type filter
   -rose:dotgraph:variableDeclarationFilter [0|1]        Disable or enable variableDeclaration filter
   -rose:dotgraph:variableDefinitionFilter [0|1]         Disable or enable variableDefinitionFilter filter
   -rose:dotgraph:noFilter [0|1]                         Disable or enable no filtering
Current filter flags' values are:
   m_asmFileFormat = 0
   m_asmType = 0
   m_binaryExecutableFormat = 0
   m_commentAndDirective = 1
   m_ctorInitializer = 0
   m_default = 1
   m_defaultColor = 1
   m_edge = 1
   m_emptySymbolTable = 0
   m_expression = 0
   m_fileInfo = 1
   m_frontendCompatibility = 0
   m_symbol = 0
   m_type = 0
   m_variableDeclaration = 0
   m_variableDefinition = 0
   m_noFilter = 0
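To make the idea of a dot graph generator concrete, here is a minimal standalone sketch that walks a tiny tree and emits Graphviz dot text. It does not use ROSE and is not how dotGenerator is implemented: the Node struct and the emitDot and toDot functions are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an AST node; this is NOT a ROSE class.
struct Node {
  std::string label;
  std::vector<Node> children;
};

// Emit one dot node per tree node and one edge per parent/child pair.
// Ids are assigned in pre-order; the id given to `n` is returned.
static int emitDot(const Node& n, int& nextId, std::ostringstream& out) {
  const int id = nextId++;
  out << "  n" << id << " [label=\"" << n.label << "\"];\n";
  for (const Node& child : n.children) {
    const int childId = emitDot(child, nextId, out);
    out << "  n" << id << " -> n" << childId << ";\n";
  }
  return id;
}

// Wrap the nodes and edges in a directed graph named AST.
std::string toDot(const Node& root) {
  std::ostringstream out;
  int nextId = 0;
  out << "digraph AST {\n";
  emitDot(root, nextId, out);
  out << "}\n";
  return out.str();
}
```

Writing the string returned by toDot to a file gives something Graphviz can lay out. The graphs produced by the ROSE tools carry far more information per node, which is why the filter options above exist.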
Dot Graph Visualization
To visualize the generated dot graph, you have to install
- Graphviz: http://www.graphviz.org/Download.php .
- ZGRViewer: http://zvtm.sourceforge.net/zgrviewer.html#download. (Version 0.8.x is recommended, since 0.9.x has bugs such as a reversed direction when dragging a graph around.)
Please note that you have to configure ZGRViewer with the correct paths to some commands it uses. You can do this from its configuration/settings menu item, or by directly modifying the text configuration file (.zgrviewer).
One example configuration is shown below (cat .zgrviewer)
<?xml version="1.0" encoding="UTF-8"?>
<zgrv:config xmlns:zgrv="http://zvtm.sourceforge.net/zgrviewer">
    <zgrv:directories>
        <zgrv:tmpDir value="true">/tmp</zgrv:tmpDir>
        <zgrv:graphDir>/home/liao6/svnrepos</zgrv:graphDir>
        <zgrv:dot>/home/liao6/opt/graphviz-2.18/bin/dot</zgrv:dot>
        <zgrv:neato>/home/liao6/opt/graphviz-2.18/bin/neato</zgrv:neato>
        <zgrv:circo>/home/liao6/opt/graphviz-2.18/bin/circo</zgrv:circo>
        <zgrv:twopi>/home/liao6/opt/graphviz-2.18/bin/twopi</zgrv:twopi>
        <zgrv:graphvizFontDir>/home/liao6/opt/graphviz-2.18/bin</zgrv:graphvizFontDir>
    </zgrv:directories>
    <zgrv:webBrowser autoDetect="true" options="" path=""/>
    <zgrv:proxy enable="false" host="" port="80"/>
    <zgrv:preferences antialiasing="false" cmdL_options="" highlightColor="-65536" magFactor="2.0" saveWindowLayout="false" sdZoom="false" sdZoomFactor="2" silent="true"/>
    <zgrv:plugins/>
    <zgrv:commandLines/>
</zgrv:config>
You also have to configure the run.sh script to use the correct path:
cat run.sh
#!/bin/sh

# If you want to be able to run ZGRViewer from any directory,
# set ZGRV_HOME to the absolute path of ZGRViewer's main directory
# e.g. ZGRV_HOME=/usr/local/zgrviewer

ZGRV_HOME=/home/liao6/opt/zgrviewer-0.8.1

java -jar $ZGRV_HOME/target/zgrviewer-0.8.1.jar "$@"
Example session
A complete example
# make sure the environment variables (PATH, LD_LIBRARY_PATH) for the installed rose are correctly set
which dotGeneratorWholeASTGraph
~/workspace/masterClean/build64/install/bin/dotGeneratorWholeASTGraph

# run the dot graph generator
dotGeneratorWholeASTGraph -c ttt.c

# see it
which run.sh
~/64home/opt/zgrviewer-0.8.2/run.sh

run.sh ttt.c_WholeAST.dot
Example output
We put some example source files and their AST dump files into: https://github.com/chunhualiao/rose-ast
- for example: https://github.com/chunhualiao/rose-ast/blob/master/func1.c_WholeAST.dot.png
- https://github.com/chunhualiao/rose-ast/blob/master/parallelfor.c_WholeAST.dot.png
Print AST as horizontal tree
SageInterface functions
// You can call the following functions with gdb

//! Pretty print AST horizontally, output to std output
void SageInterface::printAST (SgNode* node);

//! Pretty print AST horizontally, output to a specified text file
void SageInterface::printAST (SgNode* node, const char* filename);

//! Pretty print AST horizontally, output to a specified text file.
void SageInterface::printAST2TextFile (SgNode* node, const char* filename, bool printTypes=true);
A translator (textASTGenerator) is also available, with its source code under exampleTranslators/defaultTranslator .
- make install-tools will install this tool
- textASTGenerator input.c will generate a text output of the entire AST
Example use inside of gdb
- to print a portion of AST to the screen
- to print a portion of AST into a text file
(gdb) up
#7  0x00007ffff418ab5d in Unparse_ExprStmt::unparseExprStmt (this=0x1a1bf950, stmt=0x7fffda63ce30, info=...) at ../../../sourcetree/src/backend/unparser/CxxCodeGeneration/unparseCxx_statements.C:9889
(gdb) p SageInterface::printAST(stmt)
└──@0x7fffda63ce30 SgExprStatement transformation 0:0
    └──@0x7fffd8488790 SgFunctionCallExp transformation 0:0
        ├──@0x7fffe6211910 SgMemberFunctionRefExp transformation 0:0
        └──@0x7fffd7f2c370 SgExprListExp transformation 0:0
            └──@0x7fffd8488720 SgFunctionCallExp transformation 0:0
                ├──@0x7fffe6211988 SgMemberFunctionRefExp transformation 0:0
                └──@0x7fffd7f2c3d8 SgExprListExp transformation 0:0
$2 = void

(gdb) up 10
#48 0x00007ffff40dce69 in Unparser::unparseFile (this=0x7fffffff8c60, file=0x7fffeb786010, info=..., unparseScope=0x0) at ../../../sourcetree/src/backend/unparser/unparser.C:945
(gdb) p SageInterface::printAST2TextFile(file,"test.txt")
textASTGenerator
Example command line use:
textASTGenerator -c test_qualifiedName.cpp
cat test_qualifiedName.cpp.AST.txt
└──@0x7fe9f1916010 SgProject
    └──@0xb45730 SgFileList
        └──@0x7fe9f17be010 SgSourceFile
            ├──@0x7fe9fdf19120 SgGlobal test_qualifiedName.cpp 0:0
            │   ├──@0x7fe9f159a010 SgTypedefDeclaration rose_edg_required_macros_and_functions.h 0:0
            │   │   └── NULL
            │   ├──@0x7fe9f159a390 SgTypedefDeclaration rose_edg_required_macros_and_functions.h 0:0
            │   │   └── NULL
            │   ├──@0x7fe9f0f59010 SgFunctionDeclaration rose_edg_required_macros_and_functions.h 0:0 "::feclearexcept"
            │   │   ├──@0x7fe9f1391010 SgFunctionParameterList rose_edg_required_macros_and_functions.h 0:0
            │   │   │   └──@0x7fe9f1258010 SgInitializedName rose_edg_required_macros_and_functions.h 0:0 "::__excepts"
            │   │   │       └── NULL
            │   │   ├── NULL
            │   │   └── NULL
            │   ├──@0x7fe9f0f59540 SgFunctionDeclaration rose_edg_required_macros_and_functions.h 0:0 "::fegetexceptflag"
            │   │   ├──@0x7fe9f1391630 SgFunctionParameterList rose_edg_required_macros_and_functions.h 0:0
            │   │   │   ├──@0x7fe9f1258420 SgInitializedName rose_edg_required_macros_and_functions.h 0:0 "::__flagp"
            │   │   │   │   └── NULL
            │   │   │   └──@0x7fe9f1258628 SgInitializedName rose_edg_required_macros_and_functions.h 0:0 "::__excepts"
            │   │   │       └── NULL
            │   │   ├── NULL
            │   │   └── NULL
...
            │   └──@0x7fe9eff218c0 SgFunctionDeclaration test_qualifiedName.cpp 14:1 "::foo"
            │       ├──@0x7fe9ef5e0320 SgFunctionParameterList test_qualifiedName.cpp 14:1
            │       │   ├──@0x7fe9ef495278 SgInitializedName test_qualifiedName.cpp 14:13 "x"
            │       │   │   └── NULL
            │       │   └──@0x7fe9ef495480 SgInitializedName test_qualifiedName.cpp 14:20 "y"
            │       │       └── NULL
            │       ├── NULL
            │       └──@0x7fe9ee8f3010 SgFunctionDefinition test_qualifiedName.cpp 15:1
            │           └──@0x7fe9ee988010 SgBasicBlock test_qualifiedName.cpp 15:1
            │               ├──@0x7fe9eee1ba90 SgVariableDeclaration test_qualifiedName.cpp 16:3
            │               │   ├── NULL
            │               │   └──@0x7fe9ef495688 SgInitializedName test_qualifiedName.cpp 16:3 "z"
            │               │       └── NULL
            │               ├──@0x7fe9ee7ad010 SgExprStatement test_qualifiedName.cpp 17:3
            │               │   └──@0x7fe9ee7dc010 SgAssignOp test_qualifiedName.cpp 17:5
            │               │       ├──@0x7fe9ee8c0010 SgVarRefExp test_qualifiedName.cpp 17:3
            │               │       └──@0x7fe9ee813010 SgAddOp test_qualifiedName.cpp 17:9
            │               │           ├──@0x7fe9ee8c0078 SgVarRefExp test_qualifiedName.cpp 17:7
            │               │           └──@0x7fe9ee84a010 SgMultiplyOp test_qualifiedName.cpp 17:12
            │               │               ├──@0x7fe9ee8c00e0 SgVarRefExp test_qualifiedName.cpp 17:11
            │               │               └──@0x7fe9ee881010 SgIntVal test_qualifiedName.cpp 17:13
            │               └──@0x7fe9ee77e010 SgReturnStmt test_qualifiedName.cpp 18:3
            │                   └──@0x7fe9ee8c0148 SgVarRefExp test_qualifiedName.cpp 18:10
            ├── NULL
            ├── NULL
            └── NULL
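The branch layout of this text output can be reproduced with a short recursive printer. The sketch below is standalone and does not use ROSE: Node, printTree, and renderTree are hypothetical names, and the four-column indent per level is an assumption rather than something taken from the ROSE sources.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an AST node; this is NOT a ROSE class.
struct Node {
  std::string label;
  std::vector<Node> children;
};

// Print `n` on its own line, then its children one level deeper.
// `prefix` carries the vertical bars of all ancestors that still have
// siblings left to print; `isLast` selects the corner or tee connector.
static void printTree(const Node& n, const std::string& prefix, bool isLast,
                      std::ostream& out) {
  out << prefix << (isLast ? "└──" : "├──") << n.label << "\n";
  const std::string childPrefix = prefix + (isLast ? "    " : "│   ");
  for (std::size_t i = 0; i < n.children.size(); ++i) {
    printTree(n.children[i], childPrefix, i + 1 == n.children.size(), out);
  }
}

// Render a whole tree; the root is always treated as a last child.
std::string renderTree(const Node& root) {
  std::ostringstream out;
  printTree(root, "", true, out);
  return out.str();
}
```

The only state passed down is the prefix string. A node that is not the last child contributes a vertical bar to its descendants' prefix, and a last child contributes blanks, which is why the bars in the output above stop exactly where a subtree ends.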
Render the AST in HTML
The repo errington1/ast-to-html contains a tool for rendering the ROSE abstract syntax "graph" as collapsible HTML, with shared nodes and cycles represented by HTML links. For now, it is available only from the command line. The plan is to add command-line options to omit parts of the tree and to make the tool available as a library. For now, it somewhat arbitrarily omits portions of the tree that originate from the file rose_edg_required_macros_and_functions.h.
The command:
astToHTML file.C
will produce file.C.html
which can be viewed with a browser:
firefox file.C.html
Sanity Check
We provide a set of sanity checks for the AST and use them to make sure the AST is consistent. ROSE developers are strongly encouraged to run a sanity check after their AST transformation is done. This is a higher standard than merely unparsing the AST into compilable code: it is common for an AST to unparse correctly but then fail the sanity check.
The recommended sanity check is
- AstTests::runAllTests(project); from src/midend/astDiagnostics. Internally, it calls the following checks:
- TestAstForProperlyMangledNames
- TestAstCompilerGeneratedNodes
- AstTextAttributesHandling
- AstCycleTest
- TestAstTemplateProperties
- TestAstForProperlySetDefiningAndNondefiningDeclarations
- TestAstSymbolTables
- TestAstAccessToDeclarations
- TestExpressionTypes
- TestMangledNames::test()
- TestParentPointersInMemoryPool::test()
- TestChildPointersInMemoryPool::test()
- TestMappingOfDeclarationsInMemoryPoolToSymbols::test()
- TestLValueExpressions
- TestMultiFileConsistancy::test() //2009
- TestAstAccessToDeclarations::test(*i); // named type test
There are some other functions floating around, but they should be merged into AstTests::runAllTests(project):
- FixSgProject(*project); //in Qing's AST interface
- Utility::sanityCheck(SgProject* )
- Utility::consistencyCheck(SgProject*) // SgFile*
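To give a feel for the kind of invariant such a check enforces, the following standalone sketch counts child nodes whose parent pointer does not point back to the node that holds them. It only illustrates the idea and is not the implementation of TestParentPointersInMemoryPool or any other ROSE test; Node and countParentMismatches are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an AST node; this is NOT a ROSE class.
struct Node {
  Node* parent = nullptr;
  std::vector<Node*> children;
};

// Count children, at any depth below `n`, whose parent pointer does not
// point back at the node that lists them as a child.
int countParentMismatches(const Node* n) {
  assert(n != nullptr);
  int mismatches = 0;
  for (const Node* child : n->children) {
    if (child == nullptr) continue;  // null child slots are tolerated here
    if (child->parent != n) ++mismatches;
    mismatches += countParentMismatches(child);
  }
  return mismatches;
}
```

A transformation that attaches a new subtree but forgets to set its parent pointers would be reported by a check of this shape.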
Text Output of an AST
Just call SgNode::unparseToString(). You can call it on any SgLocatedNode within the AST to dump the text of that part of the AST.
AST Iterator
The iterator class follows the STL iterator pattern. It is implemented as a pre-order traversal and maintains its own stack. The iterator performs exactly the same traversal as the traversal classes in ROSE (it uses the same underlying information):
#include "RoseAst.h"
#include <iostream>

SgNode* node = ....; // any subtree
RoseAst ast(node);
for (RoseAst::iterator i = ast.begin(); i != ast.end(); ++i) {
  std::cout << "We are here:" << (*i)->class_name() << std::endl;
}
Some more features:
- By default it does not traverse null pointers (you won't see them). If you also want to visit all the null pointers, start the traversal with: ast.begin().withNullValues()
- It can also exclude subtrees during the traversal. Simply call the following on the *iterator*:
- i.skipChildrenOnForward(); ++i; // skips the children of current node and goes to the next node that follows in the traversal after all those children
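The mechanics described above (pre-order traversal, an explicit stack, and skipping the children of the current node) can be sketched without ROSE as follows. This is a simplified illustration, not the RoseAst implementation: Node, PreorderIterator, atEnd, and visitOrder are hypothetical names, and the sketch uses an atEnd() test instead of a full STL-style begin()/end() pair.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an AST node; this is NOT a ROSE class.
struct Node {
  std::string name;
  std::vector<Node*> children;
};

// Pre-order iterator that keeps its own stack of pending nodes.
class PreorderIterator {
 public:
  explicit PreorderIterator(Node* root) {
    if (root != nullptr) stack_.push_back(root);
  }
  bool atEnd() const { return stack_.empty(); }
  Node* operator*() const {
    assert(!atEnd());
    return stack_.back();
  }
  // Request that the next ++ does not descend into the current node.
  void skipChildrenOnForward() { skipChildren_ = true; }
  PreorderIterator& operator++() {
    assert(!atEnd());
    Node* current = stack_.back();
    stack_.pop_back();
    if (!skipChildren_) {
      // Push children right to left so the leftmost child is visited next.
      for (auto it = current->children.rbegin();
           it != current->children.rend(); ++it) {
        if (*it != nullptr) stack_.push_back(*it);  // null children are skipped
      }
    }
    skipChildren_ = false;  // the request applies to one step only
    return *this;
  }

 private:
  std::vector<Node*> stack_;
  bool skipChildren_ = false;
};

// Collect node names in visit order, skipping everything below `skipBelow`.
std::vector<std::string> visitOrder(Node* root, const std::string& skipBelow) {
  std::vector<std::string> order;
  for (PreorderIterator i(root); !i.atEnd(); ++i) {
    order.push_back((*i)->name);
    if ((*i)->name == skipBelow) i.skipChildrenOnForward();
  }
  return order;
}
```

Because the pending nodes live on an explicit stack rather than in recursive calls, pruning a subtree amounts to not pushing its children, so the traversal continues with the node that would have followed the whole subtree.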
Relevant sourcefiles
Content of AST
SgType
Some useful member functions
- get_base_type(): a member function on some IR nodes derived from SgType. It returns the immediate (non-recursively stripped) type beneath a typedef, reference, pointer, array, modifier, etc.
- findBaseType(): recursively strips away all typedefs (SgTypedefType), references (SgReferenceType), pointers (SgPointerType), arrays (SgArrayType), and modifiers (SgModifierType).
- SgType * stripType (unsigned char bit_array=STRIP_MODIFIER_TYPE|STRIP_REFERENCE_TYPE|STRIP_POINTER_TYPE|STRIP_ARRAY_TYPE|STRIP_TYPEDEF_TYPE) const
Returns hidden type beneath layers of typedefs, pointers, references, modifiers, array representation, etc.
- SgType * stripTypedefsAndModifiers () const
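The difference between stripping every layer and stripping only selected layers can be illustrated with a small standalone model. Nothing here is ROSE code: Type, Kind, the STRIP_* constants, and stripLayers are hypothetical, the flag values are made up, and the sketch assumes that stripping stops at the first layer whose flag is not in the mask.

```cpp
#include <cassert>

// Hypothetical model of a type as a chain of wrapper layers around a base type.
enum class Kind { Base, Typedef, Reference, Pointer, Array, Modifier };

struct Type {
  Kind kind;
  const Type* base;  // the wrapped type; nullptr for Kind::Base
};

// Made-up flag values for this sketch; they are not ROSE's constants.
constexpr unsigned char STRIP_MODIFIER = 1;
constexpr unsigned char STRIP_REFERENCE = 2;
constexpr unsigned char STRIP_POINTER = 4;
constexpr unsigned char STRIP_ARRAY = 8;
constexpr unsigned char STRIP_TYPEDEF = 16;
constexpr unsigned char STRIP_ALL = 31;

// Map a layer kind to the flag that allows removing it.
static unsigned char flagFor(Kind k) {
  switch (k) {
    case Kind::Modifier:  return STRIP_MODIFIER;
    case Kind::Reference: return STRIP_REFERENCE;
    case Kind::Pointer:   return STRIP_POINTER;
    case Kind::Array:     return STRIP_ARRAY;
    case Kind::Typedef:   return STRIP_TYPEDEF;
    case Kind::Base:      return 0;
  }
  return 0;
}

// Peel layers off `t` while the outermost layer's flag is set in `mask`.
const Type* stripLayers(const Type* t, unsigned char mask = STRIP_ALL) {
  assert(t != nullptr);
  while (t->base != nullptr && (flagFor(t->kind) & mask) != 0) {
    t = t->base;
  }
  return t;
}
```

With the default mask the sketch behaves like a recursive strip down to the base type. With a narrower mask it stops at the first layer it is not allowed to remove, which is the situation where a caller wants to look through typedefs and modifiers but still see that the underlying type is a pointer.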
File location information
All AST nodes with file location information derive from SgLocatedNode, which has start and end Sg_File_Info objects to indicate where the construct begins and ends.
You can obtain and print out the pair of location information by calling
locatedNode->get_startOfConstruct()->display();
locatedNode->get_endOfConstruct()->display();

// get beginning info only
locatedNode->get_file_info()->display();
The output for display() may look like
Inside of Sg_File_Info::display(debug.......)
     isTransformation = false
     isCompilerGenerated = true (no position information)
     isOutputInCodeGeneration = false
     isShared = false
     isFrontendSpecific = true (part of ROSE support for gnu compatability)
     isSourcePositionUnavailableInFrontend = false
     isCommentOrDirective = false
     isToken = false
     file_id = 2
     filename = /home/liao6/daily-test-rose/upcwork/install/include/gcc_HEADERS/rose_edg_required_macros_and_functions.h
     line = 167 column = 1
....
// transformation generated, will be outputted by the unparser
upcr_pshared_ptr_t gsj;
Inside of Sg_File_Info::display(debug.......)
     isTransformation = true (part of a transformation)
     isCompilerGenerated = false
     isOutputInCodeGeneration = true (output in code generator)
     isShared = false
     isFrontendSpecific = false
     isSourcePositionUnavailableInFrontend = false
     isCommentOrDirective = false
     isToken = false
     file_id = -3
     filename = transformation
     line = 0 column = 0
As you can see, there are AST nodes generated by ROSE's frontends or by a translator. A transformation-generated located node may not have line or column numbers.
You can get the file name, line, and column numbers:
SgLocatedNode* node = .... ;
Sg_File_Info* info_start = node->get_startOfConstruct ();
size_t a_start = (size_t)info_start->get_line ();
string filename = node->get_file_info()->get_filename();

Sg_File_Info* info_end = node->get_endOfConstruct ();
size_t a_end = (info_end == NULL) ? a_start : info_end->get_line ();
Preprocessing Information
See more at ROSE Compiler Framework/PreprocessingInfo
In addition to nodes and edges, a ROSE AST may carry attributes that hold preprocessing information such as #include or #if .. #else. They are attached before, after, or within a nearby AST node (only a node with source location information).
An example translator will traverse the input code's AST and dump information which may include preprocessing information.
For example
exampleTranslators/defaultTranslator/preprocessingInfoDumper -c main.cxx
-----------------------------------------------
Found an IR node with preprocessing Info attached:
(memory address: 0x2b7e1852c7d0 Sage type: SgFunctionDeclaration) in file
/export/tmp.liao6/workspace/userSupport/main.cxx (line 3 column 1)
-------------PreprocessingInfo #0 ----------- :
classification = CpreprocessorIncludeDeclaration:
 String format = #include "all_headers.h"
relative position is = before
Source: http://www.rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf (Chapter 29 - Handling Comments, Preprocessor Directives, And Adding Arbitrary Text to Generated Code)
AST Construction
The SageBuilder and SageInterface namespaces provide functions to create ASTs and manipulate them. See the Doxygen docs.