ROSE Compiler Framework/FAQ
We collect a list of frequently asked questions about ROSE, mostly from the rose-public mailing list.
General
How to search the rose-public mailing list for previously asked questions?
Use the following query on Google search:
site:https://mailman.nersc.gov/pipermail/rose-public $(ADD_YOUR_SEARCH_TERM_HERE)
How to check the version of ROSE?
The version is defined in ROSE_Install_path/include/rose/rosePublicConfig.h:
/* Define to the version of this package. */
#define ROSE_PACKAGE_VERSION "0.9.8.54"
To check this in your code:
bool checkRoseVersionNumber(const std::string &need) {
    std::vector<std::string> needParts = rose::StringUtility::split('.', need);
    std::vector<std::string> haveParts = rose::StringUtility::split('.', ROSE_PACKAGE_VERSION);
    for (size_t i=0; i < needParts.size() && i < haveParts.size(); ++i) {
        // Note: the parts are compared as strings, not as numbers
        if (needParts[i] != haveParts[i])
            return needParts[i] < haveParts[i];
    }
    // E.g., need = "1.2" and have = "1.2.x", or vice versa
    return true;
}
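Because the function above compares version parts as strings, a part such as "10" sorts before "9". The following is a minimal, ROSE-independent sketch of a numeric comparison. The names parseVersion and versionAtLeast are made up for this illustration, every part is assumed to be a non-negative decimal number, and missing trailing parts are treated as 0, which is a slightly different policy from the function above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a dotted version string such as "0.9.8.54" into integer parts.
// Assumption: every part is a non-negative decimal number.
std::vector<int> parseVersion(const std::string& version) {
    std::vector<int> parts;
    std::istringstream in(version);
    std::string field;
    while (std::getline(in, field, '.')) {
        parts.push_back(std::stoi(field));
    }
    return parts;
}

// Return true if version `have` is at least version `need`.
// Missing trailing parts count as 0, so "1.2" equals "1.2.0".
bool versionAtLeast(const std::string& have, const std::string& need) {
    const std::vector<int> h = parseVersion(have);
    const std::vector<int> n = parseVersion(need);
    const std::size_t count = h.size() > n.size() ? h.size() : n.size();
    for (std::size_t i = 0; i < count; ++i) {
        const int hv = i < h.size() ? h[i] : 0;
        const int nv = i < n.size() ? n[i] : 0;
        if (hv != nv) return hv > nv;  // first differing part decides
    }
    return true;  // all parts are equal
}
```

In a ROSE-based tool, the first argument would typically be ROSE_PACKAGE_VERSION, e.g. versionAtLeast(ROSE_PACKAGE_VERSION, "0.9.8.54").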
Why can't ROSE staff members answer all my questions?
It can feel very frustrating when you get no response to a question submitted to the rose-public@nersc.gov mailing list. You may wonder why the ROSE staff sometimes cannot help either.
Here are some possible excuses:
- They are just as busy as everybody else in the research and development fields. They may be working around the clock to meet deadlines for proposals, papers, project reviews, deliverables, etc.
- They don't know every corner of their own compiler, given the breadth and depth of contributions made to ROSE by collaborators, former staff members, post-docs, and interns. Moreover, most contributions lack good documentation--something that should be remedied in the future.
- Some questions are simply difficult and open research and development questions. They may have no clue, either.
- They just feel lazy sometimes or are taking a thing called vacation.
Possible alternatives to have your questions answered and your problems solved in a timely fashion:
- Please do your own homework first (e.g. Google).
- The ROSE team is actively addressing the documentation problem, through an internal code review process to enforce well-documented contributions going forward.
- Help others to help yourself. Answer questions on the rose-public@nersc.gov mailing list and contribute to this community-editable Wikibook.
- Find ways to formally collaborate with, or fund, the ROSE team. Things go faster when money is flowing :-) Sad, but true, reality in this busy world.
How many lines of source code does ROSE have?
Excluding the EDG submodule and all source code comments, the core of ROSE (rose/src) has about 674,000 lines of C/C++ source code as of July 11, 2012.
Including the tests, projects, and tutorial directories, ROSE has about 2 million lines of code.
Some details are shown below:
[rose/src]./cloc-1.56.pl .
    3076 text files.
    2871 unique files.
     716 files ignored.

http://cloc.sourceforge.net v 1.56  T=26.0 s (91.7 files/s, 39573.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C++                            908          75280          93960         354636
C                              123          12010           3717         199087
C/C++ Header                   915          28302          38412         121373
Bourne Shell                    17           3346           4347          25326
Perl                             4            743           1078           7888
Java                            18           1999           4517           7096
m4                               1            747             20           6489
Python                          34           1984           1174           5363
make                           148           1682           1071           3666
C#                              11            899            274           2546
SQL                              1              0              0           1817
Pascal                           5            650             31           1779
CMake                          168           1748           4880           1702
yacc                             3            352            186           1544
Visual Basic                     6            228            421           1180
Ruby                            11            281            181            809
Teamcenter def                   3              3              0            606
lex                              2            103             47            331
CSS                              1             95             32            314
Fortran 90                       1             34              6            244
Tcl/Tk                           2             29              6            212
HTML                             1              8              0             15
-------------------------------------------------------------------------------
SUM:                          2383         130523         154360         744023
-------------------------------------------------------------------------------
How large is ROSE?
To show top level information only (in MB):
du -msl * | sort -nr
170 tests
109 projects
90 src
19 docs
16 winspecific
16 ROSE_ResearchPapers
15 binaries
7 scripts
5 LicenseInformation
4 tutorial
4 autom4te.cache
2 libltdl
2 exampleTranslators
2 configure
2 config
2 ChangeLog
Sort directories by their sizes in megabytes:
du -m | sort -nr >~/size.txt
709 .
250 ./.git
245 ./.git/objects
243 ./.git/objects/pack
170 ./tests
109 ./projects
90 ./src
76 ./tests/CompileTests
50 ./tests/RunTests
40 ./tests/RunTests/FortranTests
34 ./tests/RunTests/FortranTests/LANL_POP
29 ./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1
27 ./src/3rdPartyLibraries
23 ./tests/roseTests
23 ./src/frontend
22 ./tests/CompileTests/Fortran_tests
21 ./tests/CompilerOptionsTests
19 ./docs
18 ./tests/CompileTests/RoseExample_tests
18 ./src/midend
18 ./docs/Rose
16 ./winspecific
16 ./ROSE_ResearchPapers
15 ./tests/CompileTests/Fortran_tests/gfortranTestSuite
15 ./binaries/samples
15 ./binaries
14 ./tests/CompileTests/Fortran_tests/gfortranTestSuite/gfortran.dg
14 ./src/roseExtensions
11 ./projects/traceAnalysis
10 ./tests/CompileTests/A++Code
10 ./tests/CompilerOptionsTests/testCpreprocessorOption
10 ./tests/CompilerOptionsTests/A++Code
10 ./src/roseExtensions/qtWidgets
10 ./src/frontend/Disassemblers
10 ./projects/symbolicAnalysisFramework
10 ./projects/SATIrE
10 ./projects/compass
9 ./winspecific/MSVS_ROSE
9 ./tests/RunTests/A++Tests
9 ./tests/roseTests/binaryTests
9 ./src/frontend/SageIII
9 ./projects/symbolicAnalysisFramework/src
9 ./docs/Rose/powerpoints
8 ./winspecific/MSVS_project_ROSETTA_empty
8 ./projects/simulator
7 ./tests/RunTests/FortranTests/LANL_POP_OLD
7 ./tests/CompileTests/Cxx_tests
7 ./src/midend/programTransformation
7 ./src/midend/programAnalysis
7 ./src/3rdPartyLibraries/libharu-2.1.0
7 ./scripts
7 ./projects/symbolicAnalysisFramework/src/mpiAnal
7 ./projects/RTC
6 ./winspecific/MSVS_ROSE/Debug
6 ./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1/ncdap_test
6 ./tests/roseTests/programAnalysisTests
6 ./src/3rdPartyLibraries/ckpt
6 ./src/3rdPartyLibraries/antlr-jars
6 ./projects/SATIrE/src
5 ./tests/RunTests/FortranTests/LANL_POP/pop-distro
5 ./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1/libcf
5 ./tests/CompileTests/ElsaTestCases
5 ./src/ROSETTA
5 ./src/3rdPartyLibraries/qrose
5 ./projects/DatalogAnalysis
5 ./projects/backstroke
5 ./LicenseInformation
5 ./docs/Rose/AstProcessing
To list files based on size
find . -type f -print0 | xargs -0 ls -s | sort -k1,1rn
241568 ./.git/objects/pack/pack-f366503d291fc33cb201781e641d688390e7f309.pack
13484 ./tests/CompileTests/RoseExample_tests/Cxx_Grammar.h
10240 ./projects/traceAnalysis/vmp-hw-part.trace
6324 ./tests/RunTests/FortranTests/LANL_POP_OLD/poptest.tgz
5828 ./winspecific/MSVS_ROSE/Debug/MSVS_ROSETTA.pdb
4732 ./.git/objects/pack/pack-f366503d291fc33cb201781e641d688390e7f309.idx
4488 ./binaries/samples/bgl-helloworld-mpicc
4488 ./binaries/samples/bgl-helloworld-mpixlc
4080 ./LicenseInformation/edison_group.pdf
3968 ./projects/RTC/tags
3952 ./src/frontend/Disassemblers/x86-InstructionSetReference-NZ.pdf
3908 ./tests/CompileTests/RoseExample_tests/trial_Cxx_Grammar.C
3572 ./winspecific/MSVS_project_ROSETTA_empty/MSVS_project_ROSETTA_empty.ncb
3424 ./src/frontend/Disassemblers/x86-InstructionSetReference-AM.pdf
2868 ./.git/index
2864 ./projects/compassDistribution/COMPASS_SUBMIT.tar.gz
2864 ./projects/COMPASS_SUBMIT.tar.gz
2740 ./ROSE_ResearchPapers/2007-CommunicatingSoftwareArchitectureUsingAUnifiedSingle-ViewVisualization-ICECCS.pdf
2592 ./docs/Rose/powerpoints/rose_compiler_users.pptx
2428 ./src/3rdPartyLibraries/ckpt/wrapckpt.c
2408 ./projects/DatalogAnalysis/jars/weka.jar
2220 ./scripts/graph.tar
1900 ./src/3rdPartyLibraries/antlr-jars/antlr-3.3-complete.jar
1884 ./src/3rdPartyLibraries/antlr-jars/antlr-3.2.jar
1848 ./src/midend/programTransformation/ompLowering/run_me_defs.inc
1772 ./src/3rdPartyLibraries/qrose/docs/QROSE.pdf
1732 ./tests/CompileTests/Cxx_tests/longFile.C
1724 ./src/midend/programTransformation/ompLowering/run_me_task_defs.inc
1656 ./ChangeLog
1548 ./tests/roseTests/binaryTests/yicesSemanticsExe.ans
1548 ./tests/roseTests/binaryTests/yicesSemanticsLib.ans
1480 ./ROSE_ResearchPapers/1997-ExpressionTemplatePerformanceIssues-IPPS.pdf
1408 ./docs/Rose/powerpoints/ExaCT_AllHands_March2012_ROSE.pptx
...
Compilation
Cannot download the EDG binary tarball
Three possible reasons:
- The website hosting EDG binaries is down (there is a manual way to get the binary).
- We don't support the platform you use, so no EDG binary is available for you.
- You cloned your ROSE from an unofficial repo, so the build process cannot figure out the right version of the EDG binary for you (there is a solution mentioned below).
It is possible that the rosecompiler.org website is down for maintenance.
So you may encounter the following error message:
make[3]: Entering directory `/home/leo/workspace/github-rose/buildtree/src/frontend/CxxFrontend'
test -d /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries && cp /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries/roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz . || wget http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz
--2012-08-05 12:58:29-- http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz
Resolving www.rosecompiler.org... 128.55.6.204
Connecting to www.rosecompiler.org|128.55.6.204|:80... failed: No route to host.
make[3]: *** [roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz] Error 4
In this case, you should ask for the missing tarball or find it at our backup location.
You don't have to clone the entire EDG binary repo since it is big. You can just download the one you need (click the raw file link on github.com).
Once you get the tarball, copy it to your build tree's CxxFrontend subdirectory:
- buildtree/src/frontend/CxxFrontend
Then you should be able to build ROSE normally by typing make.
TODO: automate the search using the alternative path to obtain edg binary
Another possible reason is that you cloned your local rose repo from an unofficial repository.
- In order to maintain the correct matching between rose source and EDG binary, we require a canonical repository to be available.
make[3]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend/Clang'
Unable to find a remote tracking a canonical repository. Please add a
canonical repository as a remote and ensure it is up to date. Currently
configured remotes are:
    origin => git@xxx.com/myrose.git
Potential canonical repositories include:
    anything ending with "rose.git" (case insensitive)
Unable to find a remote tracking a canonical repository. Please add a
canonical repository as a remote and ensure it is up to date. Currently
configured remotes are:
    origin => git@xxx.com/myrose.git
Potential canonical repositories include:
    anything ending with "rose.git" (case insensitive)
make[3]: Entering directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend'
test -d /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries && cp /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries/roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz . || wget http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz
--2013-02-15 17:26:42-- http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz
Resolving www.rosecompiler.org... 128.55.6.204
Connecting to www.rosecompiler.org|128.55.6.204|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2013-02-15 17:26:42 ERROR 404: Not Found.
make[3]: *** [roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz] Error 1
make[3]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend'
make: *** [all-recursive] Error 1
make: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src'
Solution: add an official rose repo as an additional remote repo of your local repo
- Add a canonical repository, like the one at GitHub: git remote add official-rose https://github.com/rose-compiler/rose.git
- git fetch official-rose // to retrieve hash numbers etc. in the canonical repository
- Now you can build ROSE again. It should find the canonical repo you just added and use it to find a matching EDG binary.
How to access EDG or EDG-SAGE connection code?
From page 5 of http://rosecompiler.org/ROSE_UserManual/ROSE-UserManual.pdf
The connection code that was used to translate EDG’s AST to SAGE III was derived loosely from the EDG C++ source generator and has formed the basis of the SAGE III translator from EDG to SAGE III’s IR.
Under the license we have, the EDG source code and the translation from the EDG AST in distributions are excluded from source release and are made available through a binary format. No part of the EDG work is visible to the user of ROSE. The EDG source are available only to those who have the EDG research or commercial license.
Chapter 2.6 "Getting a Free EDG License for Research Use" of the manual has instructions about how to obtain the EDG license.
Once you obtain the license, please contact the staff members of ROSE to verify your license. After that, they will give you more instructions about how to proceed.
How to speed up compiling ROSE?
Question: It takes hours to compile ROSE. How can I speed up this process?
Answer:
- If you have a multi-core processor, try make -j4 (build using four processes, or even more if you like).
- Also try building only librose.so under src/ by typing make -C src/ -j4
- Or build only the language support you are interested in by selecting it during configure, such as
- ../sourcetree/configure --enable-only-c # if you are only interested in C/C++ support
- ../sourcetree/configure --enable-only-fortran # if you are only interested in Fortran support
- ../sourcetree/configure --help # show all other options to enable only a few languages.
Can ROSE accept incomplete code?
https://mailman.nersc.gov/pipermail/rose-public/2011-July/001015.html
ROSE does not handle incomplete code. Though this might be possible in the future. It would be language dependent and likely depend heavily on some of the language specific tools that we use internally. This is however, not really a priority for our work. If you want to for example demonstrate how some of the internal tools we are using or alternative tools that we could use might handle incomplete code, this might be interesting and we could discuss it.
For example, we are not presently using Clang, but if it handled incomplete code that might be interesting for the future. I recall that some of the latest EDG work might handle some incomplete code, and if that is true then that might be interesting as well. I have not attempted to handle incomplete code with OFP, so I am not sure how well that could be expected to work. Similarly, I don't know what the incomplete code handling capabilities of ECJ Java support is either. If you know any of these questions we could discuss this further.
I have some doubts about how much meaningful information can come from incomplete code analysis and so that would worry me a bit. I expect it is very language dependent and there would be likely some constraints on the incomplete code. So understanding the subject better would be an additional requirement for me.
Can ROSE analyze Linux Kernel sources?
https://mailman.nersc.gov/pipermail/rose-public/2011-April/000856.html
Question: I'm trying to analyze the Linux kernel. I was not sure of the size of the code-base that can be handled by ROSE, and could not find references as to whether it has been tried on the Linux kernel source. As of now I'm trying to run the identity translator on the source, and would like to know if it can be done using ROSE, and if it has been successfully tested before.
Short answer: Not for now
Long answer: We are using EDG 3.3 internally by default and this version of EDG does not handle the GNU specific register modifiers used in the asm() statements of the Linux Kernel code. There might be other problems, but that was at least the one that we noticed in previous work on this some time ago. But we are working on upgrading the EDG frontend to be a more recent version 4.4.
Can ROSE compile C++ Boost library?
https://mailman.nersc.gov/pipermail/rose-public/2010-November/000544.html
Not yet.
I know of a few cases where ROSE can't handle parts of Boost. In each case it is an EDG problem where we are using an older version of EDG. We are trying to upgrade to a newer version of EDG (4.x), but that version's use within ROSE does not include enough C++ support, so it is not ready. The C support is internally tested, but we need more time to work on this.
AST
How to find XYZ in AST?
The usual steps to retrieve information from the AST are:
- Prepare the simplest possible compilable sample code (preferably only 5-10 lines) containing the code feature you want to find (e.g. array[i][j] if you are curious about how uses of multi-dimensional arrays appear in the AST).
- Please note: don't include any headers in the sample code. A header (#include <stdio.h> for example) can bring thousands of nodes into the AST.
- Use dotGeneratorWholeASTGraph to generate a detailed AST dot graph of the input code.
- Use zgrviewer-0.8.2's run.sh to visualize the dot graph.
- Visually/manually locate the information you want in the dot graph, to understand what to look for and where to look.
Some sample AST graphs are available at https://github.com/chunhualiao/rose-ast
How to get children of an AST node?
Once you know how to find a child in the AST manually, you can write code to walk the AST using AST member functions, traversals, SageInterface functions, etc. to retrieve the information you want.
- ROSE provides member access functions like get_X() by default for a child named X, such as get_lhs_operand() for SgBinaryOp with a child named lhs_operand in the AST graph.
- The names are shown in AST graph as labels of edges from parents to children.
To get a child by index use the function (not recommended though):
virtual SgNode * get_traversalSuccessorByIndex (size_t idx)
and/or related, similarly named functions.
How to filter out header files from AST traversals?
https://mailman.nersc.gov/pipermail/rose-public/2010-April/000144.html
Question: I want to exclude functions in #include files from my analysis/transformations during my processing.
By default, an AST traversal may visit all AST nodes, including the ones that come from headers.
So the AST processing classes provide three functions:
- T traverse(SgNode* node, ..): traverse the full AST, including nodes which represent code from include files
- T traverseInputFiles(SgProject* projectNode, ..): traverse the subtree of the AST which represents the files specified on the command line
- T traverseWithinFile(SgNode* node, ..): traverse only the nodes which represent code of the same file as the start node
Should SgIfStmt::get_true_body() return SgBasicBlock?
https://mailman.nersc.gov/pipermail/rose-public/2011-April/000930.html
Both true/false bodies were SgBasicBlock before.
Later, we decided to have more faithful representation of both blocked (with {...}) and single-statement (without { ..} ) bodies. So they are SgStatement (SgBasicBlock is a subclass of SgStatement) now.
But it seems like the document has not been updated to be consistent with the change.
You have to check in your code whether the body is a block or a single statement. Or you can use the following function to ensure that all bodies are SgBasicBlock.
//A wrapper of all ensureBasicBlockAs*() above to ensure the parent of s is a scope statement with list of statements as children, otherwise generate a SgBasicBlock in between.
SgLocatedNode * SageInterface::ensureBasicBlockAsParent (SgStatement *s)
How to handle #include "header.h", #if, #define etc. ?
These are called preprocessing info within ROSE's AST. They are attached before, after, or inside a nearby AST node (only nodes with source location information).
An example translator is provided to traverse the input code's AST and dump information about the found preprocessing information. The source code of this translator is https://github.com/rose-compiler/rose/blob/master/exampleTranslators/defaultTranslator/preprocessingInfoDumper.C .
To use the translator:
buildtree/exampleTranslators/defaultTranslator/preprocessingInfoDumper -c main.cxx
-----------------------------------------------
Found an IR node with preprocessing Info attached:
(memory address: 0x2b7e1852c7d0 Sage type: SgFunctionDeclaration) in file
/export/tmp.liao6/workspace/userSupport/main.cxx (line 3 column 1)
-------------PreprocessingInfo #0 ----------- :
classification = CpreprocessorIncludeDeclaration:
String format = #include "all_headers.h"
relative position is = before
SgClassDeclaration::get_definition() returns NULL?
If you look at the whole AST graph carefully, you can find defining and non-defining declarations for the same class.
A symbol is usually associated with a non-defining declaration. A class definition is associated with a defining declaration.
You may want to get the defining declaration from the non-defining declaration before you try to grab the definition, as in this function:
SgFunctionDefinition* getFunctionDefinitionFromDeclaration(const SgFunctionDeclaration* funcDecl)
{
  // Get the defining declaration (we don't know if funcDecl is the defining or nonDefining declaration)
  SgFunctionDeclaration* funcDefDecl = isSgFunctionDeclaration(funcDecl->get_definingDeclaration());
  ROSE_ASSERT(funcDefDecl != NULL);

  // Get the definition from the defining declaration
  SgFunctionDefinition* funcDef = isSgFunctionDefinition(funcDefDecl->get_definition());
  ROSE_ASSERT(funcDef != NULL);
  return funcDef;
}
How to handle arrays?
The first step is to get familiar with the AST representation of array types (SgArrayType) and array references (SgPntrArrRefExp). Then you can retrieve the necessary information from the AST.
To understand array types and array references, here is one example:
// cat ~/temp/array.c
int a[5][10][15]; // array declaration, a type is declared

int foo()
{
  return a[0][1][2]; // a reference to array element
}
An Array Type is represented by SgArrayType.
int a[5][10][15] corresponds to three SgArrayType nodes linked together.
a->get_type() will return the first one:
- SgArrayType_1: (index=5, base_type = SgArrayType_2)
- SgArrayType_2: (index=10, base_type = SgArrayType_3)
- SgArrayType_3: (index=15, base_type = SgTypeInt )
So a traversal from the first SgArrayType down to the element type yields all dimension sizes: 5, 10, 15.
The subtree looks like
SgArrayType_1
   /      \
  5    SgArrayType_2
         /      \
       10    SgArrayType_3
               /      \
             15     SgTypeInt
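The walk over the chained types can be sketched without ROSE. ToyArrayType below is a hypothetical stand-in for SgArrayType, not a ROSE class. In real ROSE code the same loop would presumably go through the SgArrayType accessors for the index expression and the base type, so check the reference documentation for the exact names.

```cpp
#include <vector>

// Hypothetical stand-in for SgArrayType: one dimension size plus a base type.
// `base` is null when the base type is the element type (e.g. int).
struct ToyArrayType {
    int size;
    const ToyArrayType* base;
};

// Walk the chain from the outermost array type down to the element type,
// collecting one dimension size per level.
std::vector<int> collectDimensions(const ToyArrayType* type) {
    std::vector<int> dims;
    for (const ToyArrayType* t = type; t != nullptr; t = t->base) {
        dims.push_back(t->size);
    }
    return dims;
}
```

For the three chained types of int a[5][10][15], this yields the sizes in declaration order, outermost first.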
An array reference is represented by SgPntrArrRefExp
A reference like: a[0][1][2]
- SgPntrArrRefExp_1 <lhs= ref_2, rhs=2>
- SgPntrArrRefExp_2 <lhs= ref_3, rhs=1>
- SgPntrArrRefExp_3 <lhs= SgVarRefExp (a_symbol), rhs=0>
The subtree should look like the following:
          a[0][1][2]      // SgPntrArrRefExp
           /      \
      a[0][1]      2      // SgIntVal
       /    \
    a[0]     1
    /  \
   a    0
SgVarRefExp
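Note that the outermost node carries the last subscript, so collecting subscripts from the top down produces them in reverse order. A minimal sketch with a hypothetical ToyExpr node (a stand-in for SgPntrArrRefExp and SgVarRefExp, not ROSE API) shows the walk:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for an expression node. An array reference has an
// lhs (the array being indexed) and a subscript; a plain variable reference
// has lhs == nullptr.
struct ToyExpr {
    const ToyExpr* lhs;
    int subscript;  // only meaningful when lhs != nullptr
};

// Starting from the top-level reference (a[0][1][2]), follow the lhs links
// down to the variable, then reverse so the subscripts read left to right.
std::vector<int> collectSubscripts(const ToyExpr* top) {
    std::vector<int> subs;
    for (const ToyExpr* e = top; e != nullptr && e->lhs != nullptr; e = e->lhs) {
        subs.push_back(e->subscript);
    }
    std::reverse(subs.begin(), subs.end());
    return subs;
}
```

In a real tool, SageInterface::isArrayReference() listed below can return the subscripts directly, so a hand-written walk like this is mainly useful for understanding the tree shape.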
There are quite a few functions related to array handling in http://rosecompiler.org/ROSE_HTML_Reference/namespaceSageInterface.html
You can just search "array" to find them:
// Check if an expression is an array access (SgPntrArrRefExp). If so, return its name expression and subscripts if requested. Users can use convertRefToInitializedName() to get the possible name. It does not check if the expression is a top level SgPntrArrRefExp.
bool SageInterface::isArrayReference (SgExpression *ref, SgExpression **arrayNameExp=NULL, std::vector< SgExpression * > **subscripts=NULL)

// returns the array dimensions in an array as defined for arrtype
std::vector< SgExpression * > SageInterface::get_C_array_dimensions (const SgArrayType &arrtype)

// Get the number of dimensions of an array type.
int SageInterface::getDimensionCount (SgType *t)

// Get the element type of an array.
SgType * SageInterface::getArrayElementType (SgType *t)
Some example code using these functions can be found in https://github.com/rose-compiler/rose-develop/blob/master/src/midend/programTransformation/ompLowering/omp_lowering.cpp
For example, void linearizeArrayAccess(SgPntrArrRefExp* top_array_ref) rewrites array reference using multiple-dimension subscripts to a reference using one-dimension subscripts:
- a[i][j] is changed to a[i*col_size +j]
- a [i][j][k] is changed to a [(i*col_size + j)*K_size +k]
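The arithmetic behind this rewrite is ordinary row-major linearization. The following sketch is not the ROSE implementation; linearIndex is a name invented here to show how the flat subscript is computed from the dimension sizes and the subscripts.

```cpp
#include <cstddef>
#include <vector>

// Compute the flat row-major offset for the given subscripts into an array
// with the given dimension sizes. For dims {R, C, K} and subs {i, j, k}
// this evaluates to (i*C + j)*K + k; the first dimension size is not needed.
std::size_t linearIndex(const std::vector<std::size_t>& dims,
                        const std::vector<std::size_t>& subs) {
    std::size_t offset = 0;
    for (std::size_t d = 0; d < dims.size() && d < subs.size(); ++d) {
        offset = offset * dims[d] + subs[d];
    }
    return offset;
}
```

For example, with dimensions 5 x 10 x 15, the element [1][2][3] maps to (1*10 + 2)*15 + 3 = 183.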
Sample code to handle 1-D array references
For 1-D array element access a[0], the AST with 3 nodes looks like:
   a[0]       // node 1: SgPntrArrRefExp
   /  \
  a    0      // node 3: SgIntVal
  |
  // node 2: SgVarRefExp
So the code searching for SgVarRefExp will find a. The next step is to check its type.
SgVarRefExp *vref = ...
ROSE_ASSERT (vref != NULL);
SgType* t = vref->get_type();
if (SgArrayType* atype = isSgArrayType(t)) // now you have array type
{
  // obtain the dimension vector
  std::vector<SgExpression*> dimensions = SageInterface::get_C_array_dimensions (*atype);
  // dimensions.size() should be 1 if you only handle 1-D array types
  if (dimensions.size() == 1)
  {
    // get_parent() returns SgNode*, so cast it; now you get a[0] from a.
    SgPntrArrRefExp* arr_ref_exp = isSgPntrArrRefExp(vref->get_parent());
    // do the things you want with a (vref) and a[0] (arr_ref_exp)
  }
}
else if (SageInterface::isScalarType(t)) // if scalar types, handle them differently
{
  ...
}
How to add new AST nodes?
There is a section named "1.7 Adding New SAGE III IR Nodes (Developers Only)" in the ROSE Developer’s Guide (http://www.rosecompiler.org/ROSE_DeveloperInstructions.pdf).
But before you decide to add new nodes, consider whether AstAttribute (user-defined objects attached to the AST) would be sufficient for your problem.
For example, the 1st version of the OpenMP implementation in ROSE (rose/projects/OpenMP_Translator) started by using AstAttribute to represent information parsed from pragmas. Only in the 2nd version did we introduce dedicated AST nodes.
There are two separate steps when new kinds of IR nodes are added into ROSE:
- First step (declaration): Adding class declaration/implementation into ROSE for the new IR nodes. This step is mostly related to ROSETTA.
- Second step (creation): Creating those new IR nodes at some point: such as somewhere within frontend, midend, or even backend if desired. So this step is decided case by case.
If the new types of IR come from their counterparts in EDG, then modifications to the EDG/SAGE connection code are needed. If not, the EDG/SAGE connection code may be irrelevant.
If you are trying to add new nodes to represent pragma information, you can create your new nodes without involving EDG or its connection to ROSE. You just parse the pragma string in the original AST and create your own nodes to get a new version of AST. Then it should be done.
How does the AST merge work?
Tests that demonstrate the AST merge are in the directory:
tests/nonsmoke/functional/CompileTests/mergeAST_tests
(run "make check" to see hundreds of tests go by).
parent vs. scope
An AST node can have a parent node which is different from its scope.
For example: the struct declaration's parent is the typedef declaration. But the struct's scope is the scope of the typedef declaration.
typedef struct frame {int x;} s_frame;
Parsing text into AST
There is some experimental support to parse simple code text into AST pieces. It is not intended to parse entire source codes. But the support should be able to be extended to handle more types of input.
Some documentation about this work:
- http://rosecompiler.org/ROSE_HTML_Reference/namespaceAstFromString.html
- http://rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf Chapter 33 Parser Building Blocks
Example project using the parser building blocks
- projects/pragmaParsing should work.
Translation
How to skip system headers in translation?
Often we are only interested in user code. The AST represents all code from both user files and system headers, so we need to skip things from system headers.
// Final most complete version, skip all header files, we cannot unparse changed AST from header files, at least by default
if (Inliner::skipHeaders)
{
  string filename = funcall->get_file_info()->get_filename();
  string suffix = StringUtility::fileNameSuffix(filename);
  // vector.tcc: This is an internal header file, included by other library headers
  if (suffix=="h" || suffix=="hpp" || suffix=="hh" || suffix=="H" || suffix=="hxx" || suffix=="h++" || suffix=="tcc")
    return false;
  // also check if it is compiler generated, mostly template instantiations. They are not from user code.
  if (funcall->get_file_info()->isCompilerGenerated())
    return false;
  // check if the file is within include-staging/ header directories
  if (insideSystemHeader(funcall))
    return false;
}

//------------partial solutions
bool processStatements(SgNode* n)
{
  ROSE_ASSERT (n!=NULL);
  // Skip compiler generated code, system headers, etc.
  if (isSgLocatedNode(n))
  {
    if (isSgLocatedNode(n)->get_file_info()->isCompilerGenerated())
      return false;
  }
  ...
}
This is based on Sg_File_Info
Inside of Sg_File_Info::display(debug.......)
isTransformation = false
isCompilerGenerated = true (no position information)
isOutputInCodeGeneration = false
isShared = false
isFrontendSpecific = true (part of ROSE support for gnu compatability)
isSourcePositionUnavailableInFrontend = false
isCommentOrDirective = false
isToken = false
file_id = 2
filename = /home/liao6/daily-test-rose/upcwork/install/include/gcc_HEADERS/rose_edg_required_macros_and_functions.h
line = 167 column = 1
....
shared[1] int gsj;
Inside of Sg_File_Info::display(debug.......)
isTransformation = false
isCompilerGenerated = false
isOutputInCodeGeneration = false
isShared = false
isFrontendSpecific = false
isSourcePositionUnavailableInFrontend = false
isCommentOrDirective = false
isToken = false
filename = /home/liao6/svnrepos/mycode/rose/upc/unshared.upc
line = 6 column = 1
file_id = 1
filename = /home/liao6/svnrepos/mycode/rose/upc/unshared.upc
line = 6 column = 1
Another way: ROSE makes a copy of all system headers and stores them in dedicated paths, so the file path of a node can be checked against those paths:
bool insideSystemHeader (SgLocatedNode* node)
{
  bool rtval = false;
  ROSE_ASSERT (node != NULL);
  Sg_File_Info* finfo = node->get_file_info();
  if (finfo != NULL)
  {
    string fname = finfo->get_filenameString();
    string buildtree_str1 = string("include-staging/gcc_HEADERS");
    string buildtree_str2 = string("include-staging/g++_HEADERS");
    string installtree_str1 = string("include/edg/gcc_HEADERS");
    string installtree_str2 = string("include/edg/g++_HEADERS");
    // if the file name has a sys header path of either the build or the install tree
    if ((fname.find (buildtree_str1, 0) != string::npos) ||
        (fname.find (buildtree_str2, 0) != string::npos) ||
        (fname.find (installtree_str1, 0) != string::npos) ||
        (fname.find (installtree_str2, 0) != string::npos))
      rtval = true;
  }
  return rtval;
}
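Both checks above boil down to string tests on a file name, so they can be tried out without building ROSE. The sketch below combines them. fileSuffix and looksLikeHeader are names invented for this illustration; fileSuffix only approximates what StringUtility::fileNameSuffix is assumed to do, and the suffix and directory lists are copied from the snippets above.

```cpp
#include <string>

// Return the text after the last '.' in the final path component,
// or an empty string if that component has no '.'.
std::string fileSuffix(const std::string& path) {
    const std::string::size_type slash = path.find_last_of('/');
    const std::string::size_type dot = path.find_last_of('.');
    if (dot == std::string::npos) return "";
    if (slash != std::string::npos && dot < slash) return "";
    return path.substr(dot + 1);
}

// True if the path has a header-like suffix or lies inside one of the
// directories where ROSE keeps its copies of system headers.
bool looksLikeHeader(const std::string& path) {
    static const char* const suffixes[] = {"h", "hpp", "hh", "H", "hxx", "h++", "tcc"};
    const std::string suffix = fileSuffix(path);
    for (const char* s : suffixes) {
        if (suffix == s) return true;
    }
    static const char* const dirs[] = {
        "include-staging/gcc_HEADERS", "include-staging/g++_HEADERS",
        "include/edg/gcc_HEADERS", "include/edg/g++_HEADERS"};
    for (const char* d : dirs) {
        if (path.find(d) != std::string::npos) return true;
    }
    return false;
}
```

The directory test matters for C++ standard headers such as vector, which have no suffix at all and would pass a suffix-only check.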
Can ROSE identityTranslator generate 100% identical output file?
https://mailman.nersc.gov/pipermail/rose-public/2011-January/000604.html
Questions: Rose identityTranslator performs some modifications, "automatically".
These modifications are:
- Expanding the assert macro.
- Adding extra brackets around constants of typedef types (e.g. c=Typedef_Example(12); is translated in the output to c = Typedef_Example((12));)
- Converting NULL to 0.
Can I avoid these modifications?
Answer: No.
There is currently no easy way to avoid these changes. Some of them are introduced by the C preprocessor (cpp). Others are introduced by the EDG front end that ROSE uses. A 100% faithful source-to-source translation may require significant changes to preprocessing directive handling and the EDG internals.
We have had some internal discussions about saving raw token strings in the AST and using them to produce faithful unparsed code. But this effort is still at its initial stage as far as I know.
How to build a tool inserting function calls?
https://mailman.nersc.gov/pipermail/rose-public/2010-July/000319.html
Question: I am trying to build a tool that inserts one or more function calls wherever the source code contains a call to a function belonging to a certain group (e.g. all functions beginning with foo_*). During the AST traversal, how can I find the right place? Is there a function in ROSE that searches for a string pattern or something similar?
Answers:
- In Chapter 28 AST Construction of the ROSE tutorial, there are examples to instrument function calls into the AST using traversals or a queryTree. I would approach this by checking the node for the specific SgFunctionDefinition (or whatever you need) and then check the name of the node to find its location.
- You can
- use the AST query mechanism to find all functions and store them in a container, e.g., Rose_STL_Container<SgNode*> nodeList = NodeQuery::querySubTree(root_node,V_Sg????);
- Then iterate over the container and check whether each function name matches what you want.
- use SageBuilder namespace's buildFunctionCallStmt() to create a function call statement.
- use SageInterface namespace's insertStatement() to do the insertion.
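The name-matching step in the list above does not need anything ROSE-specific. As a minimal sketch, assuming the function names have already been collected from the queried nodes into plain strings, a prefix test for a group such as foo_* could look like this. The helper names hasPrefix and findMatchingCalls are hypothetical and not part of ROSE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a ROSE API): true when `name` starts with `prefix`,
// e.g. prefix "foo_" selects the group foo_*.
bool hasPrefix(const std::string& name, const std::string& prefix) {
    return name.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper (not a ROSE API): returns the positions of all names
// that match the prefix. In a real translator each position would correspond
// to one node of the container returned by the AST query.
std::vector<std::size_t> findMatchingCalls(const std::vector<std::string>& names,
                                           const std::string& prefix) {
    assert(!prefix.empty());  // an empty prefix would match every function
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (hasPrefix(names[i], prefix)) {
            hits.push_back(i);
        }
    }
    return hits;
}
```

Each returned position identifies a node next to which the new call statement would then be inserted.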
How to insert a header into an input file?
There is a SageInterface function for doing this:
// Insert #include "filename" or #include <filename> (system header) into the global scope containing the current scope, right after other #include XXX.
PreprocessingInfo * SageInterface::insertHeader (const std::string &filename, PreprocessingInfo::RelativePositionType position=PreprocessingInfo::after, bool isSystemHeader=false, SgScopeStatement *scope=NULL)
How to copy/clone a function?
https://mailman.nersc.gov/pipermail/rose-public/2011-April/000919.html
We need to be more specific about the function you want to copy. Is it just a prototype function declaration (a non-defining declaration in ROSE's terms) or a function with a definition (a defining declaration in ROSE's terms)?
- Copying a non-defining function declaration can be achieved by using the following function:
// Build a prototype for an existing function declaration (defining or nondefining is fine).
SgFunctionDeclaration* SageBuilder::buildNondefiningFunctionDeclaration (const SgFunctionDeclaration *funcdecl, SgScopeStatement *scope=NULL)
- Copying a defining function declaration is semantically a problem since it introduces redefinition of the same function.
It is at least a hack to first introduce something wrong and later correct it. Here is an example translator to do the hack (copy a defining function, rename it, fix its symbol):
#include <rose.h>
#include <stdio.h>
using namespace SageInterface;

int main(int argc, char** argv)
{
  SgProject* project = frontend(argc, argv);
  AstTests::runAllTests(project);

  // Find a defining function named "bar" under project
  SgFunctionDeclaration* func = findDeclarationStatement<SgFunctionDeclaration> (project, "bar", NULL, true);
  ROSE_ASSERT (func != NULL);

  // Make a copy and set it to a new name
  SgFunctionDeclaration* func_copy = isSgFunctionDeclaration(copyStatement (func));
  func_copy->set_name("bar_copy");

  // Insert it to a scope
  SgGlobal * glb = getFirstGlobalScope(project);
  appendStatement (func_copy,glb);

#if 0
  // fix up the missing symbol, this should be optional now since SageInterface::appendStatement() should handle it transparently.
  SgFunctionSymbol *func_symbol = glb->lookup_function_symbol ("bar_copy", func_copy->get_type());
  if (func_symbol == NULL)
  {
    func_symbol = new SgFunctionSymbol (func_copy);
    glb ->insert_symbol("bar_copy", func_symbol);
  }
#endif

  AstTests::runAllTests(project);
  backend(project);
  return 0;
}
- Another thing to consider is whether you want to copy the function into another file. If so, you have to change the clone's file location information.
ROSE's unparser checks the Sg_File_Info objects of AST pieces before it decides whether to print them as text. By default, only AST coming from the input file itself or AST generated by transformation is unparsed. For example, some AST subtrees come from an included header, and it is often not desirable to unparse the content of an included header.
If the file info is still the original file info, the solution is to set the copied AST to be transformation-generated:
// Recursively set source position info (Sg_File_Info) as transformation generated.
SageInterface::setSourcePositionForTransformation (SgNode *root)
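To make the default unparsing rule described above concrete, here is a small standalone model of the decision. It is only a sketch of the rule as stated in this answer, not ROSE's actual unparser logic. FileInfoModel and wouldBeUnparsed are hypothetical names standing in for the relevant parts of Sg_File_Info.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the two Sg_File_Info properties that matter here.
// This is NOT the real ROSE class.
struct FileInfoModel {
    std::string filename;   // file the AST piece originally came from
    bool isTransformation;  // set when marked as transformation generated
};

// Models the default rule described above: an AST piece is printed only if it
// comes from the input file itself or is marked as transformation generated.
bool wouldBeUnparsed(const FileInfoModel& info, const std::string& inputFile) {
    assert(!inputFile.empty());
    return info.isTransformation || info.filename == inputFile;
}
```

Under this model, a clone that still carries the file name of its original file is skipped when it is inserted into a different file, and it becomes visible once it is marked as transformation generated.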
Can I transform code within a header file?
https://mailman.nersc.gov/pipermail/rose-public/2011-May/000971.html
No. ROSE does not unparse AST from headers right now. A summer project tried to do this, but it was not finished and is not well tested.
The options from that project are -rose:unparseHeaderFiles -rose:unparseHeaderFilesRootFolder UNPARSED_HEADERS_DIR, as used in tests/CompilerTests/UnparseHeadersTests.
https://mailman.nersc.gov/pipermail/rose-public/2010-August/000344.html
I guess ROSE does not support writing out changed headers for safety and practical reasons. A changed header has to be saved to another file, since writing to the original header is very dangerous (imagine debugging a header translator that corrupts its input headers). Then all other files and headers using the changed header have to be updated to use the new header file.
Also all files involved have to be writable by user's translators.
As a result, the current unparser skips subtrees of AST from headers by checking file flags (compiler_generated and/or output_in_code_generation etc.) stored in Sg_File_Info objects.
How to work with formal and actual arguments of functions?
https://mailman.nersc.gov/pipermail/rose-public/2011-June/001008.html
//Get the actual arguments
SgExprListExp* actualArguments = NULL;
if (isSgFunctionCallExp(callSite))
  actualArguments = isSgFunctionCallExp(callSite)->get_args();
else if (isSgConstructorInitializer(callSite))
  actualArguments = isSgConstructorInitializer(callSite)->get_args();
ROSE_ASSERT(actualArguments != NULL);
const SgExpressionPtrList& actualArgList = actualArguments->get_expressions();

//Get the formal arguments.
SgInitializedNamePtrList formalArgList;
if (calleeDef != NULL)
  formalArgList = calleeDef->get_declaration()->get_args();

//The number of actual arguments can be less than the number of formal arguments (with implicit arguments) or greater
//than the number of formal arguments (with varargs)
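Because the two lists can differ in length, code that matches actual arguments to formal arguments should not index both lists blindly. The following standalone sketch pairs them by position and stops at the shorter list. It uses plain strings in place of the ROSE node types, and pairArguments is a hypothetical helper, not a ROSE function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a ROSE API): pairs each formal argument with the
// actual argument at the same position. Extra actuals (varargs) and extra
// formals (implicit or default arguments) are left unpaired.
std::vector<std::pair<std::string, std::string>>
pairArguments(const std::vector<std::string>& actuals,
              const std::vector<std::string>& formals) {
    const std::size_t n = std::min(actuals.size(), formals.size());
    std::vector<std::pair<std::string, std::string>> pairs;
    pairs.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        pairs.emplace_back(formals[i], actuals[i]);  // (formal, actual)
    }
    assert(pairs.size() == n);
    return pairs;
}
```

With the ROSE types, the same loop bound would apply to actualArgList and formalArgList from the fragment above.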
How to translate multiple files scattered in different directories of a project?
Expected behavior of a ROSE Translator:
A translator built using ROSE is designed to act like a compiler (gcc, g++, gfortran, etc., depending on the input file types). So users of the translator only need to change the build system for the input files to use the translator instead of the original compiler.
If your original compiler implicitly includes or links anything, you may have to make the include or link paths explicit after the change. For example, if mpiCC transparently links to /path/to/mpilib.a, you have to add this link flag to your modified Makefile.
On 07/25/2012 11:20 AM, Fernando Rannou wrote:
> Hello
>
> We are trying to use ROSE to refactor a big project consisting of several *.cc and *.hh files, located at various directories. Each class is defined in a *.hh file and implemented in a *.cc file. Classes include (#include) other class definitions. But we have only found single file examples.
>
> Is this possible? If so, how?
>
> Thanks
Unparsing
Generate code into different files
https://mailman.nersc.gov/pipermail/rose-public/2012-August/001742.html
Question: I wonder whether it is possible for ROSE to generate two files (.c and .cl) when it translates C to OpenCL?
Answer: The ROSE outliner has an option to output the generated function into a new file.
...
// Generate the outlined function into a separated new source file
// -rose:outline:new_file
extern bool useNewFile;
...
You may want to check how this option is used in the outliner source files to get what you want.
Binary Analysis
How is the binary analysis capability in ROSE?
Question: How is the binary analysis capability in ROSE? Is it just disassembly? Is it possible to associate the binary code with the source if combined with ROSE source code analysis?
Answer:
ROSE has various binary disassemblers (x86, ARM, MIPS, PowerPC) that, like source code analysis, create an internal representation of the binary in the form of an AST. Although the types of AST nodes for source and binaries are largely disjoint, one can analyze the binary AST using concepts similar to source analysis. ROSE has a few binary analyses. Here are some off the top of my head:
- Control flow graphs, both virtual and using Boost Graph Library.
- Function call graphs.
- Operations on control flow graphs: dominator, post-dominator
- Pointer detection analysis that tries to figure out which memory locations are used as if they were pointers in a higher level language.
- Instruction partitioning: figuring out how to group instructions into basic blocks, and how to group basic blocks into functions when all you have is a list of instructions. Its accuracy on automatically partitioning stripped, obfuscated code has been shown to be better than the best disassemblers that use debugging info and symbol tables.
- Instruction semantics for x86. This is an area of active development but supports only 32-bit integer instructions. We plan to add floating point, SIMD, 64-bit, other architectures, and a simpler API. But even as it stands, it is complete enough to simulate entire ELF executables (even "vi"). See the next bullet.
- An x86 simulator for ELF executables. This project is able to simulate how the Linux kernel loads an executable, and the various system calls made by the executable. It is complete enough to simulate many Linux programs, but also provides callback points for the user to insert various kinds of analyses. For instance, you could use it to disassemble an entire process after it has been dynamically linked. There are many examples in the projects/simulator directory. In contrast to simulators like Qemu, Bochs, valgrind, VirtualBox, VMware, etc., where speed is a primary design driver, the ROSE simulator is designed to provide user-level access to as many aspects of execution as possible.
- Plugins for instruction semantics. Instruction semantics is written in such a way that different "semantic domains" can be plugged in. ROSE has a symbolic domain, an interval domain, and a partial-symbolic domain. The symbolic domain can be used in conjunction with an SMT solver (currently supporting Yices). The interval domain is actually sets of intervals, and is binary-arithmetic-aware (i.e., it correctly handles overflows, etc., on a fixed word size). The partial-symbolic domain uses single-node expressions in order to optimize for speed and size at the expense of accuracy. Users can and have written other domains, and a new API (in the works) will make this even easier.
- Examples of data-flow analysis (e.g., the pointer analysis already mentioned), but not a well defined framework yet (someone is working on one). Currently, data-flow type analyses are implemented using the instruction semantics support: as each instruction is "executed" the domain in which it executes causes the data to flow in the machine state. Each analysis provides its own flow equation to handle the points where control flow joins from two or more directions; and provides its own "next-instruction" function to iterate over the control flow graph.
- Clone detection in various forms: several kinds of syntactic clone detection, including one using locality-sensitive hashing, and semantic clone detection via fuzz testing in a simulator.
By Robb
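To illustrate what the dominator operation in the list above computes, and how a flow equation is applied where control flow joins, here is a minimal standalone sketch of the classic iterative dominator algorithm. It is not ROSE's implementation and uses no ROSE API. The graph representation (successor lists indexed by node number) and the name computeDominators are assumptions made for this example, and every node is assumed to be reachable from the entry node.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Assumed representation for this sketch: succ[u] lists the successors of node u.
using Graph = std::vector<std::vector<std::size_t>>;

// Iterative dominator computation (illustrative, not ROSE code).
// Returns dom, where dom[v] is the set of nodes that dominate v.
// Assumes every node is reachable from `entry`.
std::vector<std::set<std::size_t>> computeDominators(const Graph& succ,
                                                     std::size_t entry) {
    const std::size_t n = succ.size();
    assert(entry < n);

    // Build predecessor lists from the successor lists.
    std::vector<std::vector<std::size_t>> pred(n);
    for (std::size_t u = 0; u < n; ++u) {
        for (std::size_t v : succ[u]) {
            assert(v < n);
            pred[v].push_back(u);
        }
    }

    // Start from "every node dominates every node", except at the entry.
    std::set<std::size_t> all;
    for (std::size_t i = 0; i < n; ++i) all.insert(i);
    std::vector<std::set<std::size_t>> dom(n, all);
    dom[entry] = {entry};

    // Repeat until a fixed point: dom[v] = {v} united with the intersection
    // of dom[p] over all predecessors p. The intersection is the flow
    // equation applied where control flow joins.
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t v = 0; v < n; ++v) {
            if (v == entry) continue;
            std::set<std::size_t> meet = all;
            for (std::size_t p : pred[v]) {
                std::set<std::size_t> kept;
                for (std::size_t x : meet) {
                    if (dom[p].count(x) != 0) kept.insert(x);
                }
                meet.swap(kept);
            }
            meet.insert(v);
            if (meet != dom[v]) {
                dom[v] = meet;
                changed = true;
            }
        }
    }
    return dom;
}
```

Post-dominators can be obtained from the same routine by running it on the reversed graph with the exit node as the entry.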
You ask about combining source and binary analysis... It's certainly
possible since ROSE can hold both the binary and source ASTs in memory
at the same time. But I'm not aware of any analysis that "sews" them
together. We do support parsing DWARF info from ELF executables, so
you might be able to use that to sew the two ASTs together.
--Robb
Daily work
git clone returns error: SSL certificate problem?
Symptom:
git clone https://github.com/rose-compiler/rose.git
Cloning into rose...
error: SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed while accessing https://github.com/rose-compiler/rose.git/info/refs
fatal: HTTP request failed
The reason may be that you are behind a firewall that alters the original SSL certificate.
Solutions: Tell cURL to not check for SSL certificates:
# Solution 1: environment variable (temporary)
$ env GIT_SSL_NO_VERIFY=true git pull
# Solution 2: git-config (permanent)
# set local configuration
$ git config --local http.sslVerify false
# or set global configuration
$ git config --global http.sslVerify false
What is the best IDE for ROSE developers?
https://mailman.nersc.gov/pipermail/rose-public/2010-April/000115.html
There may not be a widely recognized best integrated development environment. But developers have reported that they are using
- vim
- emacs
- KDevelop
- Source Navigator
- Eclipse
- Netbeans
The thing is that ROSE is huge and has some ridiculously large generated source files (CxxGrammar.h and CxxGrammar.C are generated in the build tree, for example). So many code browsers may have trouble handling ROSE.
Portability
What is the status for supporting Windows?
We do maintain some preliminary Windows support for building ROSE/src to generate librose.so by leveraging CMake. However, the work is not finished.
To build librose under Windows, type the following commands in the top-level source tree:
mkdir ROSE-build-cmake
cd ROSE-build-cmake
cmake .. -DBOOST_ROOT=${ROSE_TEST_BOOST_PATH}
// Example: boost installation path /opt/boost_1_40_0-inst
https://mailman.nersc.gov/pipermail/rose-public/2011-December/001349.html
We have not finished the Windows work yet; it is on our list of things to do. It was started, and ROSE compiles internally using MS Visual Studio (using project files generated from the CMake build that we maintain and test within our release process for ROSE), but it does not pass our tests, so it is not ready. The distribution of the EDG binaries for Windows is another step that would come after that. We don't know at present when this will be done. It is not a high priority for our DOE-specific work, but it is important for other work. The effort required is something that we could discuss. If you want to call me, that would be the best way to proceed. Send me email off the main list and we can set that up.
https://mailman.nersc.gov/pipermail/rose-public/2011-March/000798.html
Under Windows, ROSE uses CMake. This is a project that is currently under development. As of November 2010 we are able to compile and link the src directory. We are also able to run example programs that link against librose and execute the frontend and backend. However, this is an internal capability and not available externally yet, since we don't distribute the Windows-generated EDG binaries that would be required. Also, the current support for Windows is still incomplete: ROSE does not yet pass its internal tests under Windows.