ROSE Compiler Framework/How-tos

Quick, short, and focused tutorials about how to do common tasks as a ROSE developer.

Please create a new wikibook page for each how-to topic. Each how-to wiki page should NOT contain any level one (=) or level two (==) heading so it can be included at the correct levels in the print version of this wikibook.

How to write a How-to

Create a new page

Optional step: create an account and log in.
Go to: http://en.wikibooks.org/wiki/ROSE_Compiler_Framework/How-tos
Click the Edit tab at the top right of the How-tos page.
Copy and paste one existing How-to entry at the end of the page, for example:

==[[ROSE Compiler Framework/How to write a How-to|How to write a How-to]]==
{{:ROSE Compiler Framework/How to write a How-to}}

Replace the page name in all three places of the pasted text with the desired page name, for example:

==[[ROSE Compiler Framework/How to do XYZ|How to do XYZ]]==
{{:ROSE Compiler Framework/How to do XYZ}}

Click Save page.
You will see red links pointing to the not-yet-existing How to do XYZ page.
Click any of the red links; it will open an editing window where you can add the content of your new how-to page.
You can now add new content and save it.
- Again, each how-to wiki page should NOT contain any level one (=) or level two (==) heading so it can be included at the correct levels in the print version of this wikibook.

Insert image to wiki page

To use your own image in a wiki page, you have to upload the image to http://commons.wikimedia.org/.
Once you upload the image, it becomes public to all wikibooks users. Be sure to declare your copyright if you created the image yourself.
Follow these instructions to insert the image and adjust the layout of your page: http://en.wikibooks.org/wiki/Using_Wikibooks/Inserting_Images

Rules of the content

Only level three headings (===) and higher are allowed in a how-to page. This is necessary for the how-to page to be correctly included into the final one-page print version of this wikibook. Sorry about this restriction.
- Again, please don't use level one (=) or level two (==) headings in a how-to page!
Keep each how-to short and focused. Readers are expected to spend 30 minutes or less learning how to do something using ROSE.
After you have created a new how-to page and saved your contributions, please go to the print version to make sure it shows up correctly.
- Here is the link: http://en.wikibooks.org/wiki/ROSE_Compiler_Framework/Print_version
- Having new content show up in the print version will make sure it is really visible and consistent with the rest of the book.
Please specify whether the how-to topic describes the current practice or a proposed new way of doing things, so we can have clear guidelines for code review about what is mandatory and what is optional.

How to incrementally work on a project

Developing a big, sophisticated project entails many challenges. To mitigate some of these challenges, we have adopted several best practices: incremental development, code review, and continuous integration.

Here are some tips on how to divide up a big project into smaller, bite-sized pieces so each piece can be incrementally developed, code reviewed, and integrated.

Input: define different sets of test inputs based on complexity and difficulty. Tackle simpler sets first.
Output: define intermediate results leading to the final output. Often, results A and B are needed to generate C. So the project can have multiple stages, based on the intermediate results.
Algorithm: complex compiler algorithms are often just enhanced versions of more fundamental algorithms. Implement the fundamental algorithms first to gain insight and experience. Then, afterward, you can implement the full-blown versions.
Language: for projects dealing with multiple languages, focus on one language at a time.
Platform: limit the scope of supported platforms: Linux, Ubuntu, OS X (TODO: add reference to ROSE supported platforms)
Performance: start with a basic, working implementation first. Then try to optimize its performance and efficiency.
Scope: your translator could first focus on working at a function scope, then grow to handle an entire source file, or even multiple files, at the same time.
Skeleton then meat: a project should be created with the major components defined first. Each component can be enriched separately later on.
Annotations (manual vs. automated): Performing one compiler task often requires results from many other tasks being developed. Defining source code annotations as the interface between two tasks can decouple these dependencies in a clean manner. The annotations can be first manually inserted. Later the annotations can be automatically generated by the finished analysis.
Optional vs. default: introduce a flag to turn your feature on or off. Make it the default option once it matures.

How to visualize AST

Overview

Three things are needed to visualize ROSE AST:

Sample input code: you provide it
a dot graph generator to generate a dot file from AST: ROSE provides dot graph generators
a visualization tool to open the dot graph: ZGRViewer and Graphviz are used by ROSE developers

If you don't want to install ROSE, ZGRViewer, and Graphviz from scratch, you can directly use the ROSE virtual machine image, which has everything you need installed and configured so you can just visualize your sample code.

Sample input code

Please prepare the simplest possible input code, without including any headers, so that the resulting AST is small enough to digest.

Dot Graph Generator

We provide ROSE_INSTALLATION_TREE/bin/dotGeneratorWholeASTGraph (complex graph) and dotGenerator (a simpler version) to generate a dot graph of the detailed AST of input code.

There are two tools that generate an AST graph in dot format:

dotGenerator: simple AST graph generator showing essential nodes and edges
dotGeneratorWholeASTGraph: whole AST graph showing more details. It provides filter options to show/hide certain AST information.

command line:

 dotGeneratorWholeASTGraph  yourcode.c  // it is best to avoid including any header into your input code to have a small enough tree to visualize.

To skip builtin functions

dotGeneratorWholeASTGraph -DSKIP_ROSE_BUILTIN_DECLARATIONS yourcode.c

dotGeneratorWholeASTGraph -rose:help
   -rose:help                     show this help message
   -rose:dotgraph:asmFileFormatFilter           [0|1]  Disable or enable asmFileFormat filter
   -rose:dotgraph:asmTypeFilter                 [0|1]  Disable or enable asmType filter
   -rose:dotgraph:binaryExecutableFormatFilter  [0|1]  Disable or enable binaryExecutableFormat filter
   -rose:dotgraph:commentAndDirectiveFilter     [0|1]  Disable or enable commentAndDirective filter
   -rose:dotgraph:ctorInitializerListFilter     [0|1]  Disable or enable ctorInitializerList filter
   -rose:dotgraph:defaultFilter                 [0|1]  Disable or enable default filter
   -rose:dotgraph:defaultColorFilter            [0|1]  Disable or enable defaultColor filter
   -rose:dotgraph:edgeFilter                    [0|1]  Disable or enable edge filter
   -rose:dotgraph:expressionFilter              [0|1]  Disable or enable expression filter
   -rose:dotgraph:fileInfoFilter                [0|1]  Disable or enable fileInfo filter
   -rose:dotgraph:frontendCompatibilityFilter   [0|1]  Disable or enable frontendCompatibility filter
   -rose:dotgraph:symbolFilter                  [0|1]  Disable or enable symbol filter
   -rose:dotgraph:emptySymbolTableFilter        [0|1]  Disable or enable emptySymbolTable filter
   -rose:dotgraph:typeFilter                    [0|1]  Disable or enable type filter
   -rose:dotgraph:variableDeclarationFilter     [0|1]  Disable or enable variableDeclaration filter
   -rose:dotgraph:variableDefinitionFilter      [0|1]  Disable or enable variableDefinitionFilter filter
   -rose:dotgraph:noFilter                      [0|1]  Disable or enable no filtering
Current filter flags' values are: 
         m_asmFileFormat = 0 
         m_asmType = 0 
         m_binaryExecutableFormat = 0 
         m_commentAndDirective = 1 
         m_ctorInitializer = 0 
         m_default = 1 
         m_defaultColor = 1 
         m_edge = 1 
         m_emptySymbolTable = 0 
         m_expression = 0 
         m_fileInfo = 1 
         m_frontendCompatibility = 0 
         m_symbol = 0 
         m_type = 0 
         m_variableDeclaration = 0 
         m_variableDefinition = 0 
         m_noFilter = 0
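
For readers who have not seen the dot format before: a dot file is a plain-text description of a graph that Graphviz can lay out. The sketch below is not ROSE code. It uses a hypothetical ToyNode struct standing in for an AST node, and hypothetical addChild and toDot helpers, to show roughly the kind of text a dot generator writes: one line per node and one line per parent-to-child edge. The real ROSE generators emit far more detail (attributes, colors, non-tree edges).

```cpp
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an AST node; not a ROSE class.
struct ToyNode {
    std::string label;
    std::vector<std::unique_ptr<ToyNode>> children;
};

// Append a new child with the given label and return a pointer to it.
ToyNode* addChild(ToyNode& parent, const std::string& label) {
    std::unique_ptr<ToyNode> child(new ToyNode());
    child->label = label;
    parent.children.push_back(std::move(child));
    return parent.children.back().get();
}

// Write one node line, then recurse into the children and write the edges.
// Ids are handed out in visit order; the id assigned to 'node' is returned.
// Labels are written as-is, so this sketch assumes they contain no quotes.
static int writeDotNode(const ToyNode& node, std::ostream& out, int& nextId) {
    const int id = nextId++;
    out << "  n" << id << " [label=\"" << node.label << "\"];\n";
    for (const auto& child : node.children) {
        const int childId = writeDotNode(*child, out, nextId);
        out << "  n" << id << " -> n" << childId << ";\n";
    }
    return id;
}

// Produce the complete dot text for the tree rooted at 'root'.
std::string toDot(const ToyNode& root) {
    std::ostringstream out;
    int nextId = 0;
    out << "digraph ToyAST {\n";
    writeDotNode(root, out, nextId);
    out << "}\n";
    return out.str();
}
```

Saving the returned string to a file with a .dot suffix gives something Graphviz tools and ZGRViewer should be able to open, which can be a handy way to check your viewer setup before generating a real AST graph.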

Dot Graph Visualization

To visualize the generated dot graph, you have to install

Graphviz: http://www.graphviz.org/Download.php
ZGRViewer: http://zvtm.sourceforge.net/zgrviewer.html#download (Version 0.8.x is recommended, since 0.9.x has bugs such as a reversed drag direction when moving a graph around.)

Please note that you have to configure ZGRViewer with the correct paths to the commands it uses. You can do this from its configuration/settings menu item, or by directly modifying the text configuration file (.zgrviewer).

One example configuration is shown below (cat .zgrviewer)

<?xml version="1.0" encoding="UTF-8"?>
<zgrv:config xmlns:zgrv="http://zvtm.sourceforge.net/zgrviewer">
    <zgrv:directories>
        <zgrv:tmpDir value="true">/tmp</zgrv:tmpDir>
        <zgrv:graphDir>/home/liao6/svnrepos</zgrv:graphDir>
        <zgrv:dot>/home/liao6/opt/graphviz-2.18/bin/dot</zgrv:dot>
        <zgrv:neato>/home/liao6/opt/graphviz-2.18/bin/neato</zgrv:neato>
        <zgrv:circo>/home/liao6/opt/graphviz-2.18/bin/circo</zgrv:circo>
        <zgrv:twopi>/home/liao6/opt/graphviz-2.18/bin/twopi</zgrv:twopi>
        <zgrv:graphvizFontDir>/home/liao6/opt/graphviz-2.18/bin</zgrv:graphvizFontDir>
    </zgrv:directories>
    <zgrv:webBrowser autoDetect="true" options="" path=""/>
    <zgrv:proxy enable="false" host="" port="80"/>
    <zgrv:preferences antialiasing="false" cmdL_options=""
        highlightColor="-65536" magFactor="2.0" saveWindowLayout="false"
        sdZoom="false" sdZoomFactor="2" silent="true"/>
    <zgrv:plugins/>
    <zgrv:commandLines/>
</zgrv:config>

You also have to set the correct path in the run.sh script:

cat run.sh

#!/bin/sh

# If you want to be able to run ZGRViewer from any directory,
# set ZGRV_HOME to the absolute path of ZGRViewer's main directory
# e.g. ZGRV_HOME=/usr/local/zgrviewer

ZGRV_HOME=/home/liao6/opt/zgrviewer-0.8.1

java -jar $ZGRV_HOME/target/zgrviewer-0.8.1.jar "$@"

Example session

A complete example

# make sure the environment variables(PATH, LD_LIBRARY_PATH) for the installed rose are correctly set
which dotGeneratorWholeASTGraph
~/workspace/masterClean/build64/install/bin/dotGeneratorWholeASTGraph

# run the dot graph generator
dotGeneratorWholeASTGraph -c ttt.c

#see it
which run.sh
~/64home/opt/zgrviewer-0.8.2/run.sh

run.sh ttt.c_WholeAST.dot

example output

We put some example source files and their AST dump files into: https://github.com/chunhualiao/rose-ast

Print AST as horizontal tree

SageInterface functions


// You can call the following functions with gdb

   //! Pretty print AST horizontally, output to std output
   void SageInterface::printAST (SgNode* node); 


   //! Pretty print AST horizontally, output to a specified text file
   void SageInterface::printAST (SgNode* node, const char* filename); 

   //! Pretty print AST horizontally, output to a specified text file.
   void SageInterface::printAST2TextFile (SgNode* node, const char* filename, bool printTypes=true);

A translator (textASTGenerator) is also available, with its source code under exampleTranslators/defaultTranslator .

make install-tools will install this tool
textASTGenerator input.c will generate a text output of the entire AST
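
To make the horizontal tree layout easier to read, here is a toy sketch of how such a layout can be produced. It is not the SageInterface implementation: ToyNode, addChild, renderTree, and treeToString are hypothetical names, and the real printAST output also includes node addresses, file names, and line numbers. The idea is simply that each node is printed on its own line, the last child of a parent gets a corner connector, and other children get a tee connector plus a vertical bar that continues down through their subtree.

```cpp
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an AST node; not a ROSE class.
struct ToyNode {
    std::string label;
    std::vector<std::unique_ptr<ToyNode>> children;
};

// Append a new child with the given label and return a pointer to it.
ToyNode* addChild(ToyNode& parent, const std::string& label) {
    std::unique_ptr<ToyNode> child(new ToyNode());
    child->label = label;
    parent.children.push_back(std::move(child));
    return parent.children.back().get();
}

// Print 'node' on one line, then its children one indentation level deeper.
// 'prefix' carries the vertical bars of ancestors that still have siblings
// to print; 'isLast' selects the corner or the tee connector.
void renderTree(const ToyNode& node, const std::string& prefix,
                bool isLast, std::ostream& out) {
    out << prefix << (isLast ? "└──" : "├──") << node.label << "\n";
    const std::string childPrefix = prefix + (isLast ? "    " : "│   ");
    for (std::size_t i = 0; i < node.children.size(); ++i) {
        const bool lastChild = (i + 1 == node.children.size());
        renderTree(*node.children[i], childPrefix, lastChild, out);
    }
}

// Convenience wrapper: render the whole tree into a string.
std::string treeToString(const ToyNode& root) {
    std::ostringstream out;
    renderTree(root, "", true, out);
    return out.str();
}
```

Reading the real dumps works the same way: follow a vertical bar downward to find the next sibling of a node, and follow the indentation to the right to descend into its children.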

Example use inside of gdb

to print a portion of AST to the screen
to print a portion of AST into a text file

(gdb) up
#7  0x00007ffff418ab5d in Unparse_ExprStmt::unparseExprStmt (this=0x1a1bf950, stmt=0x7fffda63ce30, info=...) at ../../../sourcetree/src/backend/unparser/CxxCodeGeneration/unparseCxx_statements.C:9889

(gdb) p SageInterface::printAST(stmt)
└──@0x7fffda63ce30 SgExprStatement transformation 0:0
    └──@0x7fffd8488790 SgFunctionCallExp transformation 0:0
        ├──@0x7fffe6211910 SgMemberFunctionRefExp transformation 0:0
        └──@0x7fffd7f2c370 SgExprListExp transformation 0:0
            └──@0x7fffd8488720 SgFunctionCallExp transformation 0:0
                ├──@0x7fffe6211988 SgMemberFunctionRefExp transformation 0:0
                └──@0x7fffd7f2c3d8 SgExprListExp transformation 0:0
$2 = void


(gdb) up 10
#48 0x00007ffff40dce69 in Unparser::unparseFile (this=0x7fffffff8c60, file=0x7fffeb786010, info=..., unparseScope=0x0) at ../../../sourcetree/src/backend/unparser/unparser.C:945
(gdb) p SageInterface::printAST2TextFile(file,"test.txt")

textASTGenerator

Example command line use:

textASTGenerator -c test_qualifiedName.cpp

cat test_qualifiedName.cpp.AST.txt

└──@0x7fe9f1916010 SgProject
    └──@0xb45730 SgFileList
        └──@0x7fe9f17be010 SgSourceFile
            ├──@0x7fe9fdf19120 SgGlobal test_qualifiedName.cpp 0:0
            │   ├──@0x7fe9f159a010 SgTypedefDeclaration rose_edg_required_macros_and_functions.h 0:0
            │   │   └── NULL
            │   ├──@0x7fe9f159a390 SgTypedefDeclaration rose_edg_required_macros_and_functions.h 0:0
            │   │   └── NULL
            │   ├──@0x7fe9f0f59010 SgFunctionDeclaration rose_edg_required_macros_and_functions.h 0:0 "::feclearexcept"
            │   │   ├──@0x7fe9f1391010 SgFunctionParameterList rose_edg_required_macros_and_functions.h 0:0
            │   │   │   └──@0x7fe9f1258010 SgInitializedName rose_edg_required_macros_and_functions.h 0:0 "::__excepts"
            │   │   │       └── NULL
            │   │   ├── NULL
            │   │   └── NULL
            │   ├──@0x7fe9f0f59540 SgFunctionDeclaration rose_edg_required_macros_and_functions.h 0:0 "::fegetexceptflag"
            │   │   ├──@0x7fe9f1391630 SgFunctionParameterList rose_edg_required_macros_and_functions.h 0:0
            │   │   │   ├──@0x7fe9f1258420 SgInitializedName rose_edg_required_macros_and_functions.h 0:0 "::__flagp"
            │   │   │   │   └── NULL
            │   │   │   └──@0x7fe9f1258628 SgInitializedName rose_edg_required_macros_and_functions.h 0:0 "::__excepts"
            │   │   │       └── NULL
            │   │   ├── NULL
            │   │   └── NULL

              ...

            │   └──@0x7fe9eff218c0 SgFunctionDeclaration test_qualifiedName.cpp 14:1 "::foo"
            │       ├──@0x7fe9ef5e0320 SgFunctionParameterList test_qualifiedName.cpp 14:1
            │       │   ├──@0x7fe9ef495278 SgInitializedName test_qualifiedName.cpp 14:13 "x"
            │       │   │   └── NULL
            │       │   └──@0x7fe9ef495480 SgInitializedName test_qualifiedName.cpp 14:20 "y"
            │       │       └── NULL
            │       ├── NULL
            │       └──@0x7fe9ee8f3010 SgFunctionDefinition test_qualifiedName.cpp 15:1
            │           └──@0x7fe9ee988010 SgBasicBlock test_qualifiedName.cpp 15:1
            │               ├──@0x7fe9eee1ba90 SgVariableDeclaration test_qualifiedName.cpp 16:3
            │               │   ├── NULL
            │               │   └──@0x7fe9ef495688 SgInitializedName test_qualifiedName.cpp 16:3 "z"
            │               │       └── NULL
            │               ├──@0x7fe9ee7ad010 SgExprStatement test_qualifiedName.cpp 17:3
            │               │   └──@0x7fe9ee7dc010 SgAssignOp test_qualifiedName.cpp 17:5
            │               │       ├──@0x7fe9ee8c0010 SgVarRefExp test_qualifiedName.cpp 17:3
            │               │       └──@0x7fe9ee813010 SgAddOp test_qualifiedName.cpp 17:9
            │               │           ├──@0x7fe9ee8c0078 SgVarRefExp test_qualifiedName.cpp 17:7
            │               │           └──@0x7fe9ee84a010 SgMultiplyOp test_qualifiedName.cpp 17:12
            │               │               ├──@0x7fe9ee8c00e0 SgVarRefExp test_qualifiedName.cpp 17:11
            │               │               └──@0x7fe9ee881010 SgIntVal test_qualifiedName.cpp 17:13
            │               └──@0x7fe9ee77e010 SgReturnStmt test_qualifiedName.cpp 18:3
            │                   └──@0x7fe9ee8c0148 SgVarRefExp test_qualifiedName.cpp 18:10
            ├── NULL
            ├── NULL
            └── NULL

Render the AST in HTML

The repo errington1/ast-to-html contains a tool for rendering the ROSE abstract syntax "graph" as collapsible HTML, with shared nodes and cycles represented by HTML links. For now, it is available only from the command line. The plan is to add command-line options to omit parts of the tree and to make the tool available as a library. For now, it somewhat arbitrarily omits portions of the tree that originate from the file rose_edg_required_macros_and_functions.h.

The command:

astToHTML file.C

will produce file.C.html which can be viewed with a browser:

firefox file.C.html

How to create a translator

A translator basically converts one AST into another version of the AST. The translation process may add, delete, or modify the information stored in the AST.

Overview

A ROSE-based translator usually has the following steps:

Search for the AST nodes you want to translate.
Perform the translation action on the found AST nodes. This action can take one of the following forms:
- Updating the existing AST nodes.
- Creating new AST nodes to replace the original ones. This is usually a cleaner approach than patching up the existing AST and is better supported by SageBuilder and SageInterface functions.
- Deep copying existing AST subtrees to duplicate the code. Many expression subtrees should not be shared, so deep copying them is required to get a correct AST.
Optionally update other related information for the translation.

First Step

Get familiar with the ASTs before and after your translation, so you know for sure what your code will deal with and what AST your code will generate.

The best way is to prepare the simplest possible sample codes and carefully examine their whole-AST dot graphs.

More details on visualizing the AST are available at How to visualize AST.

Design considerations

It is usually a good idea to

Separate the searching step from the translation step, so one search (traversal) can be reused by all sorts of translations.
When designing the order of searching and translation, consider whether the translation will negatively impact the searching.
- Please avoid pre-order traversal, since you may end up modifying AST nodes that are to be visited later on, similar to the effect of iterator invalidation.
- Please use post-order, or the reverse of pre-order, for a traversal hooked up with translation.
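
The two recommendations above can be sketched on a toy expression tree. This is not ROSE code: ToyExpr, makeInt, makeAdd, collectAddsPostOrder, foldAdd, and foldAll are hypothetical names, and the "translation" is a simple constant fold chosen only because it is easy to check. The search step collects candidate nodes with children before parents, and the translation step then works through that list, so every inner addition has already been rewritten by the time its enclosing addition is processed.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy expression node; not a ROSE class.
struct ToyExpr {
    enum Kind { Int, Add } kind;
    int value;                                   // meaningful when kind == Int
    std::vector<std::unique_ptr<ToyExpr>> kids;  // two operands when kind == Add
};

std::unique_ptr<ToyExpr> makeInt(int v) {
    std::unique_ptr<ToyExpr> e(new ToyExpr());
    e->kind = ToyExpr::Int;
    e->value = v;
    return e;
}

std::unique_ptr<ToyExpr> makeAdd(std::unique_ptr<ToyExpr> lhs,
                                 std::unique_ptr<ToyExpr> rhs) {
    std::unique_ptr<ToyExpr> e(new ToyExpr());
    e->kind = ToyExpr::Add;
    e->value = 0;
    e->kids.push_back(std::move(lhs));
    e->kids.push_back(std::move(rhs));
    return e;
}

// Search step: collect Add nodes in post-order (children before parents).
void collectAddsPostOrder(ToyExpr& e, std::vector<ToyExpr*>& found) {
    for (auto& kid : e.kids) collectAddsPostOrder(*kid, found);
    if (e.kind == ToyExpr::Add) found.push_back(&e);
}

// Translation step: turn an Add of two Int leaves into a single Int node.
// The node is rewritten in place, so pointers to it stay valid; only its
// Int operands are destroyed, and those are never in the collected list.
bool foldAdd(ToyExpr& e) {
    if (e.kind != ToyExpr::Add || e.kids.size() != 2) return false;
    if (e.kids[0]->kind != ToyExpr::Int || e.kids[1]->kind != ToyExpr::Int) return false;
    const int sum = e.kids[0]->value + e.kids[1]->value;
    e.kids.clear();
    e.kind = ToyExpr::Int;
    e.value = sum;
    return true;
}

// Run the search once, then apply the translation to each hit in order.
// Returns the number of nodes that were folded.
int foldAll(ToyExpr& root) {
    std::vector<ToyExpr*> adds;
    collectAddsPostOrder(root, adds);
    int folded = 0;
    for (ToyExpr* add : adds) {
        if (foldAdd(*add)) ++folded;
    }
    return folded;
}
```

With a pre-order list, the outer addition in 1 + (2 + 3) would be visited while its right operand is still an addition, so it would be skipped and the tree would be left only partly translated. Keeping the collection and the rewrite as separate functions also means the same search could be reused with a different translation.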

Searching for the AST node

There are multiple ways to find things you want to translate in AST.

AST Query

Via AST Query: a node query returns a list of AST nodes of the same type. This is often enough for simple translations.

Rose_STL_Container<SgNode*> ProgramHeaderStatementList = NodeQuery::querySubTree (project,V_SgProgramHeaderStatement);
for (Rose_STL_Container<SgNode*>::iterator i = ProgramHeaderStatementList.begin(); i != ProgramHeaderStatementList.end(); i++)
{
    SgProgramHeaderStatement* ProgramHeaderStatement = isSgProgramHeaderStatement(*i);
    ...
}

More information about AST Query can be found at "6 Query Library" of the ROSE User Manual pdf.

AST Traversal

Through AST traversal: this walks through the whole AST in a chosen order (pre-order or post-order). Post-order traversal is recommended to avoid modifying things the traversal will hit later on (a problem similar to iterator invalidation in C++).
- The AST traversal provides visit() functions to hook up your translation functions. A switch statement can be used to handle different types of AST nodes.

class f2cTraversal : public AstSimpleProcessing
{
  public:
    virtual void visit(SgNode* n);
};

void f2cTraversal::visit(SgNode* n)
{
  switch(n->variantT())
  {
    case V_SgSourceFile:
      {
        SgFile* fileNode = isSgFile(n);
        translateFileName(fileNode);
      }
      break;
    case V_SgProgramHeaderStatement:
      {
        ...
      }
      break;
    default:
      break;
  }
}

More information about AST Traversal can be found at "7 AST Traversal" of the ROSE User manual pdf online.

Performing Translation

Before you write your translator, please read Chapter 32 AST Construction of ROSE tutorial pdf documentation (http://rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf). It contains essential information for any translation writers.

The translations you want to do often depend on the types of the AST nodes you visit. For example, you can have a set of translation functions defined in your namespace:

void translateForLoop(SgForStatement* n);
void translateFileName(SgFile* n);
void translateReturnStatement(SgReturnStmt* n);
// and so on

Other tips

Reference ROSE doxygen website for information of each AST node: http://rosecompiler.org/ROSE_HTML_Reference/index.html
Use the SageBuilder namespace (http://rosecompiler.org/ROSE_HTML_Reference/namespaceSageBuilder.html) if you want to create new AST nodes. Update SageBuilder if you cannot find the function you need.
Look up in SageInterface Namespace (http://rosecompiler.org/ROSE_HTML_Reference/namespaceSageInterface.html) for the translation functions you need. If there is none, then write your own function.
Besides building things from scratch, you can use SageInterface::deepCopy() to copy AST subtree.
Update the information, or create the new AST node you need.
Replace the existing AST node with your updated or new AST node.

Updating Tree

You might need to handle some details, like removing symbols and updating parent pointers and symbol tables.
Be careful when using deepDelete() and deepCopy(). Some information might not be updated properly. For example, deepDelete might not update your symbol table.
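
To illustrate the kind of bookkeeping mentioned above, here is a toy sketch of a deep copy that also repairs parent pointers, plus a small consistency check. It is not ROSE code and makes no claim about how SageInterface::deepCopy() works internally: ToyNode, addChild, deepCopy, and parentsConsistent are hypothetical names, and a real AST has more cross-links (symbols, scopes, declarations) than the single parent pointer modeled here.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an AST node with a parent pointer; not a ROSE class.
struct ToyNode {
    std::string label;
    ToyNode* parent = nullptr;
    std::vector<std::unique_ptr<ToyNode>> children;
};

// Append a new child and set its parent pointer.
ToyNode* addChild(ToyNode& parent, const std::string& label) {
    std::unique_ptr<ToyNode> child(new ToyNode());
    child->label = label;
    child->parent = &parent;
    parent.children.push_back(std::move(child));
    return parent.children.back().get();
}

// Copy the whole subtree so that no node is shared with the original.
// Each copied child points at its copied parent, not at the original one.
// The copy's own parent is left null until it is attached somewhere.
std::unique_ptr<ToyNode> deepCopy(const ToyNode& src) {
    std::unique_ptr<ToyNode> copy(new ToyNode());
    copy->label = src.label;
    for (const auto& child : src.children) {
        std::unique_ptr<ToyNode> childCopy = deepCopy(*child);
        childCopy->parent = copy.get();
        copy->children.push_back(std::move(childCopy));
    }
    return copy;
}

// Check that every child in the subtree points back at its actual parent.
bool parentsConsistent(const ToyNode& node) {
    for (const auto& child : node.children) {
        if (child->parent != &node) return false;
        if (!parentsConsistent(*child)) return false;
    }
    return true;
}
```

A copy that reused the original child pointers, or forgot to reset the parent pointers, would fail this check. The same habit carries over to a real translator: after copying or replacing a subtree, verify the links that the rest of the AST relies on.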

Verify the correctness

You can use the whole AST graph to verify your translation.

All ROSE-based translators should call AstTests::runAllTests(project) after all transformations are done, to make sure the translated AST is correct.

This sets a higher standard than merely unparsing to compilable code. It is common for an AST to be unparsed correctly but fail the sanity check.

More information is at Sanity_check

Sample translators

Here we list a few sample translators that you can grow into the more sophisticated ones you need.

Find pragmas

/*
toy code
by Liao, 12/14/2007
*/
#include "rose.h"
#include <iostream>
using namespace std;

class visitorTraversal : public AstSimpleProcessing
{
  protected:
    virtual void visit(SgNode* n);
};

void visitorTraversal::visit(SgNode* node)
{
  if (node->variantT() == V_SgPragmaDeclaration) {
      cout << "pragma!" << endl;
  }
}

int main(int argc, char * argv[])
{
  SgProject *project = frontend (argc, argv);
  visitorTraversal myvisitor;
  myvisitor.traverseInputFiles(project,preorder);

  return backend(project);
}

Here is an example project doing pragma parsing and saving the results into AST attributes.

https://github.com/rose-compiler/rose-develop/tree/master/projects/pragmaParsing

Loop transformation

SageInterface namespace (http://rosecompiler.org/ROSE_HTML_Reference/namespaceSageInterface.html) has many translation functions, such as those for loops.

For example, there is a loop tiling function defined in https://github.com/rose-compiler/rose/blob/master/src/frontend/SageIII/sageInterface/sageInterface.C :

//     Tile the n-level (starting from 1) loop of a perfectly nested loop nest using tiling size s.
bool     loopTiling (SgForStatement *loopNest, size_t targetLevel, size_t tileSize)

An example test translator is provided to test this function:

https://github.com/rose-compiler/rose/blob/master/tests/roseTests/astInterfaceTests/loopTiling.C

And it has a test input file:

https://github.com/rose-compiler/rose/blob/master/tests/roseTests/astInterfaceTests/inputloopTiling.C

How to build your translator

See How to set up the makefile for a translator

How to create a cross-language translator

This how-to presents the steps for creating a cross-language translator. We will use a Fortran to C translator as an example.

Change the sourcefile information

Change the output file name. The file suffix has to be changed with the following function:

void SgFile::set_unparse_output_filename (std::string unparse_output_filename )

change the output language type.

void SgFile::set_outputLanguage(SgFile::outputLanguageOption_enum outputLanguage)

Set the output to be target-language only.

 We use set_C_only for the Fortran to C translation.  This process might be optional.

void SgFile::set_C_only(bool C_only)
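
As a small illustration of the file name step, the helper below computes an output name with a new suffix. It is plain C++ with no ROSE dependency, and replaceSuffix is a hypothetical name, not a ROSE function. The commented line at the end shows how the result might be handed to set_unparse_output_filename, assuming you already hold an SgFile pointer and the input file name.

```cpp
#include <string>

// Hypothetical helper: replace the suffix of the last path component.
// If that component has no suffix (or is a dot-file), newSuffix is appended.
std::string replaceSuffix(const std::string& fileName,
                          const std::string& newSuffix) {
    const std::string::size_type slash = fileName.find_last_of("/\\");
    const std::string::size_type dot = fileName.rfind('.');
    const bool hasSuffix =
        dot != std::string::npos && dot != 0 &&
        (slash == std::string::npos || dot > slash + 1);
    const std::string stem = hasSuffix ? fileName.substr(0, dot) : fileName;
    return stem + newSuffix;
}

// Possible use inside a translator ('file' and 'inputName' are assumed to exist):
//   file->set_unparse_output_filename(replaceSuffix(inputName, ".c"));
```

Only the last dot after the last path separator is treated as the start of the suffix, so directory names that contain dots are left untouched.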

Identify language-dependent AST node

Example: ROSE uses different AST nodes to represent a loop in C and in Fortran. The following two figures show the same loop in the two languages.

C uses SgForStatement for the for loops.

[Figure: C SgForStatement]

Fortran uses SgFortranDo for the do loops.

[Figure: Fortran SgFortranDo]

Implement the translation functions

Use the whole AST graph as a reference when implementing the translation function.
Generate the new AST node by copying the required information from the original AST node.
Remove the original node, and make sure the parent/child relationships in the AST are set up properly.

Testing output code

If a compiler for the target language is available, run the backend so that the backend compiler generates object files from the output code.
If no compiler is available for the target language, make sure output code can be generated for the test cases. It is suggested to run compilation tests on all of the test output.

How to set up the makefile for a translator

In this How-to, you will create a makefile to compile and test your own custom ROSE translator.

You may want to first look at "How-to install ROSE": ROSE Compiler Framework/Installation.

Environment variables

You must have the proper environment variable set so your translator can find librose.so during execution.

export LD_LIBRARY_PATH=${ROSE_INSTALL}/lib:${BOOST_INSTALL}/lib:$LD_LIBRARY_PATH

Translator Code

Here is the simplest ROSE translator.

// ROSE translator example: identity translator.
//
// No AST manipulations, just a simple translation:
//
//    input_code > ROSE AST > output_code

#include <rose.h>

int main (int argc, char** argv)
{
    // Build the AST used by ROSE
    SgProject* project = frontend(argc, argv);

    // Run internal consistency tests on AST
    AstTests::runAllTests(project);

    // Insert your own manipulations of the AST here...

    // Generate source code from AST and invoke your
    // desired backend compiler
    return backend(project);
}

Example 1

If you have a project that's separate from ROSE (i.e., you compile it with an *installed* version of ROSE) it's up to you how to do things.

If the project depends only on ROSE and ROSE's dependencies then you can use the Makefile described at the end of the ROSE installation instructions http://rosecompiler.org/ROSE_HTML_Reference/installation.html

# Sample makefile for programs that use the ROSE library.
#
# ROSE has a number of configuration details that must be present when
# compiling and linking a user program with ROSE, and some of these 
# details are difficult to get right.  The most foolproof way to get
# these details into your own makefile is to use the "rose-config"
# tool. 
#
#
# This makefile assumes:
#   1. The ROSE library has been properly installed (refer to the
#      documentation for configuring, building, and installing ROSE).
#
#   2. The top of the installation directory is $(ROSE_INSTALLED). This
#      is the same directory you specified for the "--prefix" argument
#      of the "configure" script, or the "CMAKE_INSTALL_PREFIX" if using 
#      cmake. E.g., "/usr/local".
#
# The "rose-config" tool currently only works for ROSE configured with
# GNU auto tools (e.g., you ran "configure" when you built and
# installed ROSE). The "cmake" configuration is not currently
# supported by "rose-config" [September 2015].
##############################################################################

# Standard C++ compiler stuff (see rose-config --help)
CXX      = $(shell $(ROSE_INSTALLED)/bin/rose-config cxx)
CPPFLAGS = $(shell $(ROSE_INSTALLED)/bin/rose-config cppflags)
CXXFLAGS = $(shell $(ROSE_INSTALLED)/bin/rose-config cxxflags)
LDFLAGS  = $(shell $(ROSE_INSTALLED)/bin/rose-config ldflags)
LIBDIRS  = $(shell $(ROSE_INSTALLED)/bin/rose-config libdirs)

MOSTLYCLEANFILES =

##############################################################################
# Assuming your source code is "demo.C" to build an executable named "demo".

all: demo

demo.o: demo.C
	$(CXX) $(CPPFLAGS) $(CXXFLAGS) -o $@ -c $^

demo: demo.o
	$(CXX) $(CXXFLAGS) -o $@ $^ $(LDFLAGS)
	@echo "Remember to set:"
	@echo "  LD_LIBRARY_PATH=$(LIBDIRS):$$LD_LIBRARY_PATH"

MOSTLYCLEANFILES += demo demo.o

##############################################################################
# Standard boilerplate

.PHONY: clean 
clean:
	rm -f $(MOSTLYCLEANFILES)

Complete examples

There are project examples demonstrating different ways of building your projects using ROSE's headers/libraries.

They are available at: https://github.com/chunhualiao/rose-project-templates

A few templates for independent projects using ROSE. By independent, we mean the projects are located outside of ROSE's source tree.

template-project-v1 : using Makefile to build the project
template-project-v2 : using Makefile to build and run a ROSE plugin

How to debug a translator

It is rare that your translator will just work right after you finish coding. Using gdb to debug your code is indispensable for making sure it works as expected. This page shows examples of how to debug your translator.

Preparations

First and foremost, make sure your ROSE installation and your translator were built with -g and without GCC optimizations turned on. This ensures that debug information is preserved as well as possible.

To configure ROSE installation with debugging options, you can add the following options to your normal configuration.

 ../rose/configure --with-CXX_DEBUG=-g --with-C_OPTIMIZE=-O0 --with-CXX_OPTIMIZE=-O0  ...

If you already built ROSE but forgot what options you used, you can go to your buildtree of ROSE to double check if debugging options are used:

cd buildDebug/
-bash-4.2$ head config.log

  $ ../sourcetree/configure --with-java=/path/to/java/jdk/1.8.0_131 --with-boost=/path/to/boost/1_60_0/gcc/4.9.3 --with-CXX_DEBUG=-g --with-C_OPTIMIZE=-O0 --with-CXX_OPTIMIZE=-O0 --enable-languages=c++,fortran

Before you debug your own translators, you may want to double check that ROSE's builtin translator (rose-compiler) can handle your input code properly. If not, you should report the bug to the ROSE team.

If rose-compiler can handle it but your customized translator cannot, the problem may be caused by the customizations you introduced in your translator.

Another thing is to reduce your input code to be as small as possible so it can just trigger the error you are interested in. This will simplify the bug hunting process dramatically. It is very difficult to debug a translator processing thousands of lines of code.

Basics of GDB

gdb is a debugger. It provides a controlled execution environment for you to inspect if your program is running the way you expected.

Essentially, it allows you to:

run your program within a controlled debugging environment: using gdb --args <program> <args...>
- or libtool --mode=execute gdb --args <program> <args...> for libtool-built executables.
stop at desired execution points
- normal breakpoints (called breakpoints): using break <where>, <where> can be a function name, line_number, or file:line_number.
- when the value of a given variable or expression changes (called a watchpoint): using watch <expression>
- segmentation fault : this will happen automatically, so you can inspect how a seg fault happens
- assertion failure: this will happen automatically, so you can debug assertion failures.
inspect and even change things like variables, types, etc. once your program stops at desired execution points
- inspect the call stack at the breakpoint: using backtrace or bt in short. frame <frame#> to go to the stack frame of your interests.
- look around relevant source code near the breakpoint: using list [+|-|filename:linenumber|filename:function]
- inspect the values of variables and expressions: using print <what>, <what> can be any variable, expression, or even function call.
- inspect the type of a variable: whatis variable_name
- change the content of a variable to a given value: set var <var_name>=<value>
- call functions: using print function_name(<args>) or call function_name(<args>); this is helpful for calling dump functions of class objects.
control the execution further
- step through the execution of your program one statement at a time: you can step over calls in the current frame (next), step down into a called function (step), or step out of the current stack frame (finish).
- continue the execution until next breakpoint or watchpoint: using continue or c in short
- return from a function immediately, passing a given value: return <expression>
and other things.

For a quick overview, you can look through a cheat sheet online:

https://kapeli.com/cheat_sheets/GDB.docset/Contents/Resources/Documents/index

From Rob: there is a curses-based wrapper around GDB called "cgdb" [1].

You get a split window: the bottom is the GDB console and the top is syntax-highlighted source code that automatically tracks your current location and supports PageUp/PageDn, which is a lot easier to use than GDB's "list" command.
It requires ncurses-devel and readline-devel to be installed.

A translator not built by ROSE's build system

Some people also call this an out-of-source-tree build.

If the translator is built using a makefile without libtool, debugging it follows the classic gdb steps.

Make sure your translator is compiled with the GNU debugging option -g so that there is debugging information in your object files.

These are the steps of a typical debugging session:

1. Set a breakpoint

2. Examine the execution path to make sure the program goes through the path that you expected

3. Examine the local data to validate their values

# how to print out information about an AST node
#-------------------------------------
(gdb) print n
$1 = (SgNode *) 0xb7f12008

# Check the type of a node
#-------------------------------------
(gdb) print n->sage_class_name()
$2 = 0x578b3af "SgFile"

(gdb) print n->get_parent()
$7 = (SgNode *) 0x95e75b8

# Convert a node to its real node type then call its member functions
#---------------------------
(gdb) print isSgFile(n)->getFileName()

#-------------------------------------
# When displaying a pointer to an object, identify the actual (derived) type of the object 
# rather than the declared type, using the virtual function table. 
#-------------------------------------
(gdb) set print object on
(gdb) print astNode
$6 = (SgPragmaDeclaration *) 0xb7c68008

# unparse the AST from a node
# Only works for AST pieces with full scope information
# It will report error if scope information is not available at any ancestor level.
#-------------------------------------
(gdb) print n->unparseToString()

# print out Sg_File_Info 
#-------------------------------------
(gdb) print n->get_file_info()->display()

Example 1: debugging an AST traversal

We first prepare an example ROSE-based analyzer that traverses the AST to find loops. Download it and rename it to demo.C:

wget https://raw.githubusercontent.com/rose-compiler/rose/develop/tutorial/visitorTraversal.C
mv visitorTraversal.C demo.C

We can look at the example analyzer's source code with cat demo.C. Essentially, it has the following content:

  4 #include "rose.h"
  5 
  6 class visitorTraversal : public AstSimpleProcessing
  7    {
  8      public:
  9           visitorTraversal();
 10           virtual void visit(SgNode* n);
 11           virtual void atTraversalEnd();
 12    };
 13 
 14 visitorTraversal::visitorTraversal()
 15    {
 16    }
 17 
 18 void visitorTraversal::visit(SgNode* n)
 19    {
 20      if (isSgForStatement(n) != NULL)
 21         {
 22           printf ("Found a for loop ... \n");
 23         }
 24    }
 25 
 26 void visitorTraversal::atTraversalEnd()
 27    {
 28      printf ("Traversal ends here. \n");
 29    }
 30 
 31 int
 32 main ( int argc, char* argv[] )
 33    {
 34   // Initialize and check compatibility. See Rose::initialize
 35      ROSE_INITIALIZE;
 36 
 37      if (SgProject::get_verbose() > 0)
 38           printf ("In visitorTraversal.C: main() \n");
 39 
 40      SgProject* project = frontend(argc,argv);
 41      ROSE_ASSERT (project != NULL);
 42 
 43   // Build the traversal object
 44      visitorTraversal exampleTraversal;
 45 
 46   // Call the traversal function (member function of AstSimpleProcessing)
 47   // starting at the project node of the AST, using a preorder traversal.
 48      exampleTraversal.traverseInputFiles(project,preorder);
 49 
 50      return 0;
 51    }

A ROSE-based tool initializes ROSE first (at line 35). Then the frontend() function is called to parse the input code and generate an AST rooted at project, which is of type SgProject (at line 40).

After that, a traversal object is declared at line 44. The object is used to traverse the input files of the project, using a preorder traversal.

The traversal object is based on a derived visitorTraversal class at line 6. This derived class has member functions to define what should happen during construction (line 14), visiting a node (line 18), and the end of the traversal (line 26).

Now get a sample makefile to build the source file into an executable file:

wget https://raw.githubusercontent.com/rose-compiler/rose/develop/tutorial/SampleMakefile

The makefile should be self-explanatory. It uses rose-config in the installation path to set various environment variables for compilers, compilation and linking flags, library path, etc.

Get an example input code for the analyzer:

wget https://raw.githubusercontent.com/rose-compiler/rose/develop/tutorial/inputCode_ExampleTraversals.C

The input code has two for loops, at lines 20 and 41.

Prepare the environment variable used to specify where ROSE is installed.

export ROSE_HOME=/home/freecc/install/rose_install

Build the analyzer:

make -f SampleMakefile

There should now be an executable file named demo in the current directory.

Finally, run the demo analyzer to process the example input code:

./demo -c inputCode_ExampleTraversals.C

The analyzer should find two for loops and report the end of the traversal.

Found a for loop ...
Found a for loop ...
Traversal ends here.

Debug The Translator

Now let's debug this simple translator.

First, use gdb -args to run the translator with its options:

gdb -args ./demo -c inputCode_ExampleTraversals.C

// r means run: It is usually a good practice to run the program without setting breakpoints first to see if it can run normally
//     Or to reproduce an assertion error or seg fault
(gdb) r
Starting program: /home/liao6/workspace/rose/2019-10-31_14-16-05_-0700/myTranslator/./demo -c inputCode_ExampleTraversals.C
...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Found a for loop ...
Found a for loop ...
Traversal ends here.
[Inferior 1 (process 44697) exited normally]
...
(gdb)

// This program has no errors. So we set a break point at line 22 of demo.C

(gdb) b demo.C:22
Breakpoint 1 at 0x40b0e2: file demo.C, line 22.

// We expect this breakpoint will be hit twice since the input code has only two loops. We try to verify this:
(gdb) r
Starting program: /home/liao6/workspace/rose/2019-10-31_14-16-05_-0700/myTranslator/./demo -c inputCode_ExampleTraversals.C
warning: File "/nfs/casc/overture/ROSE/opt/rhel7/x86_64/gcc/4.9.3/mpc/1.0/mpfr/3.1.2/gmp/5.1.2/lib64/libstdc++.so.6.0.20-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load:/usr/bin/mono-gdb.py".
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, visitorTraversal::visit (this=0x7fffffffb430, n=0x7fffe87db010) at demo.C:22
22                printf ("Found a for loop ... \n");

// Hit breakpoint 1 once, try to continue to see what will happen

(gdb) c
Continuing.
Found a for loop ...

Breakpoint 1, visitorTraversal::visit (this=0x7fffffffb430, n=0x7fffe87db138) at demo.C:22
22                printf ("Found a for loop ... \n");

// Hit breakpoint 1 for the second time, try to continue

(gdb) c
Continuing.
Found a for loop ...
Traversal ends here.
[Inferior 1 (process 46262) exited normally]

// The program has terminated; Breakpoint 1 is not hit again.

// ----------now we inspect the variable n at Breakpoint 1
// rerun the program and hit Breakpoint 1
(gdb) r

Breakpoint 1, visitorTraversal::visit (this=0x7fffffffb430, n=0x7fffe87db010) at demo.C:22
22                printf ("Found a for loop ... \n");

// print out n after casting: it is indeed a SgForStatement

(gdb) p isSgForStatement(n)
$1 = (SgForStatement *) 0x7fffe87db010

// Inspect the file info of this ForStatement, understanding where it is coming from in the source code.
 
(gdb) p isSgForStatement(n)->get_file_info()->display()
Inside of Sg_File_Info::display() of this pointer = 0x7fffe94d58b0
     isTransformation                      = false
     isCompilerGenerated                   = false
     isOutputInCodeGeneration              = false
     isShared                              = false
     isFrontendSpecific                    = false
     isSourcePositionUnavailableInFrontend = false
     isCommentOrDirective                  = false
     isToken                               = false
     isDefaultArgument                     = false
     isImplicitCast                        = false
     filename = /home/liao6/workspace/rose/2019-10-31_14-16-05_-0700/myTranslator/inputCode_ExampleTraversals.C
     line     = 20  column = 6
     physical_file_id       = 0 = /home/liao6/workspace/rose/2019-10-31_14-16-05_-0700/myTranslator/inputCode_ExampleTraversals.C
     physical_line          = 20
     source_sequence_number = 8726
$2 = void

Inspect post_construction_initialization()

Breakpoints at post_construction_initialization() are useful to inspect when a node is created and whether a node has the required fields set after construction. For example, walking through the call stack that leads to this function call (using the up and down commands in gdb) lets you check whether the node has its parent or scope pointers set. If not, you can add such operations to fix bugs related to NULL pointers.

// ----------- We want to inspect when the SgForStatement nodes are created during the execution
// set a breakpoint at the post_construction_initialization() method of SgForStatement

(gdb) b SgForStatement::post_construction_initialization()
Breakpoint 2 at 0x7ffff3d6495f: file Cxx_Grammar.C, line 139566.

// Disable Breakpoint 1 for now
(gdb) disable 1

(gdb) info breakpoints
Num     Type           Disp Enb Address            What
1       breakpoint     keep n   0x000000000040b0e2 in visitorTraversal::visit(SgNode*) at demo.C:22
        breakpoint already hit 1 time
2       breakpoint     keep y   0x00007ffff3d6495f in SgForStatement::post_construction_initialization() at Cxx_Grammar.C:139566

// run until the Breakpoint 2 is hit
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Breakpoint 2, SgForStatement::post_construction_initialization (this=0x7fffe87db010) at Cxx_Grammar.C:139566
139566       if (p_for_init_stmt == NULL) {

//  use backtrace to check the function call stacks leading to this stop of Breakpoint 2. 
//  You can clearly see the callchain from main() all the way to the breakpoint.

(gdb) bt
#0  SgForStatement::post_construction_initialization (this=0x7fffe87db010) at Cxx_Grammar.C:139566
#1  0x00007ffff54e55d8 in SgForStatement::SgForStatement (this=0x7fffe87db010, test=0x0, increment=0x0, loop_body=0x0)
    at Cxx_GrammarNewConstructors.C:5258
#2  0x00007ffff5bb04ce in EDG_ROSE_Translation::parse_statement (sse=..., existingBasicBlock=0x0)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:49637
#3  0x00007ffff5bbb5ea in EDG_ROSE_Translation::parse_statement_list (sse=..., orig_kind=iek_statement, orig_ptr=0x115f810)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:53079
#4  0x00007ffff5bb0221 in EDG_ROSE_Translation::parse_statement (sse=..., existingBasicBlock=0x7fffe8934010)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:49492
#5  0x00007ffff5c09217 in EDG_ROSE_Translation::parse_function_body<SgFunctionDeclaration> (sse_base=..., p=0x1151ad0, decl=0x7fffe9e21698)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:36262
#6  0x00007ffff5b844fa in EDG_ROSE_Translation::convert_routine (p=0x1151ad0, forceTemplateDeclaration=false, edg_template=0x0,
    optional_nondefiningTemplateDeclaration=0x0) at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:34343
#7  0x00007ffff5b703cf in EDG_ROSE_Translation::parse_routine (sse=..., forceTemplateDeclaration=false, edg_template=0x0, forceSecondaryDeclaration=false)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:29866
#8  0x00007ffff5be6f78 in EDG_ROSE_Translation::parse_global_or_namespace_scope_entity (sse=...)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:64638
#9  0x00007ffff5bea2df in EDG_ROSE_Translation::parse_global_scope (inputGlobalScope=0x7ffff7ec3120, sse=..., skip_ast_translation=false)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:65427
#10 0x00007ffff5bedbee in sage_back_end (sageFile=...) at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:66777
#11 0x00007ffff5beea8a in cfe_main (argc=44, argv=0x702f80, sageFile=...)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:66992
#12 0x00007ffff5beebe7 in edg_main (argc=44, argv=0x702f80, sageFile=...)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:67093
#13 0x00007ffff3c14629 in SgSourceFile::build_C_and_Cxx_AST (this=0x7fffeb45e010, argv=..., inputCommandLine=...)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:5430
#14 0x00007ffff3c1587a in SgSourceFile::buildAST (this=0x7fffeb45e010, argv=..., inputCommandLine=...)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:5983
#15 0x00007ffff3c0e5b7 in SgFile::callFrontEnd (this=0x7fffeb45e010) at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:3119
#16 0x00007ffff3c0b576 in SgSourceFile::callFrontEnd (this=0x7fffeb45e010)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:2137
#17 0x00007ffff3c0a005 in SgFile::runFrontend (this=0x7fffeb45e010, nextErrorCode=@0x7fffffffaadc: 0)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:1606
#18 0x00007ffff3c12924 in Rose::Frontend::RunSerial (project=0x7fffeb555010)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:4613
#19 0x00007ffff3c12593 in Rose::Frontend::Run (project=0x7fffeb555010) at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:4506
#20 0x00007ffff3c0b84d in SgProject::RunFrontend (this=0x7fffeb555010) at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:2209
#21 0x00007ffff3c0bcb2 in SgProject::parse (this=0x7fffeb555010) at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:2334
#22 0x00007ffff3c0b0d4 in SgProject::parse (this=0x7fffeb555010, argv=...)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:2028
#23 0x00007ffff3cbd2e9 in SgProject::SgProject (this=0x7fffeb555010, argv=..., frontendConstantFolding=false) at Cxx_Grammar.C:29114
#24 0x00007ffff645fd54 in frontend (argv=..., frontendConstantFolding=false) at ../../../sourcetree/src/roseSupport/utility_functions.C:628
#25 0x00007ffff645fc10 in frontend (argc=3, argv=0x7fffffffb578, frontendConstantFolding=false)
    at ../../../sourcetree/src/roseSupport/utility_functions.C:590
#26 0x000000000040b152 in main (argc=3, argv=0x7fffffffb578) at demo.C:40
(gdb)

// Again, Breakpoint 2 will be hit twice since we only have two for loops in the input code

(gdb) c
Continuing.

Breakpoint 2, SgForStatement::post_construction_initialization (this=0x7fffe87db138) at Cxx_Grammar.C:139566
139566       if (p_for_init_stmt == NULL) {
(gdb) c
Continuing.
Found a for loop ...
Found a for loop ...
Traversal ends here.
[Inferior 1 (process 47292) exited normally]

Set a condition to Breakpoints

In real code, there are often hundreds of objects of the same class type (e.g. SgForStatement). Many of them come from header files and are present in the AST. We should stop only at the one we want to inspect. Often, we can use the memory address of the object as the condition.

// Add a condition to Breakpoint 2: stop only when the this pointer is equal to a given memory address
(gdb) cond 2 (unsigned long)this==(unsigned long)0x7fffe87db138

// run the program: now it will stop only when the condition for Breakpoint 2 is met, skipping all other hits to Breakpoint 2. 
(gdb) r
Starting program: /home/liao6/workspace/rose/2019-10-31_14-16-05_-0700/myTranslator/./demo -c inputCode_ExampleTraversals.C
..
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 2, SgForStatement::post_construction_initialization (this=0x7fffe87db138) at Cxx_Grammar.C:139566
139566       if (p_for_init_stmt == NULL) {

// continue the execution, after doing inspections you want. It should go to the normal termination, skipping other hits to Breakpoint 2. 
(gdb) c
Continuing.
Found a for loop ...
Found a for loop ...
Traversal ends here.
[Inferior 1 (process 47785) exited normally]

Use Watchpoints

You can use a watchpoint to stop execution whenever the value of an expression changes, without having to predict a particular place where this may happen. (This is sometimes called a data breakpoint.)

Watchpoints can be treated as a special type of breakpoint. They stop execution when the value at the watched memory location changes. This is especially useful when you want to know when a variable (or a field of an object) is set to some value or cleared. For example, a bug is often related to a NULL value in some field of a node. The field may be set during construction of the node, but later it mysteriously becomes NULL. Without a watchpoint, it is extremely hard to find when this happens.

For example, we want to watch the value changes to the parent field of the SgForStatement matching the memory address of the 2nd loop.

We first stop at a breakpoint where we have access to the node's internal fields. This usually is done by stopping at SgForStatement::post_construction_initialization ().
Once the internal variables are visible in gdb at the proper breakpoint, we can grab the memory address of the internal variable. This requires knowing how internal variables are named. You can either look at the class declaration of the object, or guess the name by convention. For example, a field with an access function like get_something() usually has a corresponding internal variable named p_something in ROSE AST node types.
Finally, we have to watch the dereferenced value of the memory address (watch *address). Watching the memory address itself (watch address) watches a constant value, so it won't work.

(gdb) info breakpoints
Num     Type           Disp Enb Address            What
1       breakpoint     keep n   0x000000000040b0e2 in visitorTraversal::visit(SgNode*) at demo.C:22
2       breakpoint     keep y   0x00007ffff3d6495f in SgForStatement::post_construction_initialization() at Cxx_Grammar.C:139566
        stop only if (unsigned long)this==(unsigned long)0x7fffe87db138
        breakpoint already hit 1 time

(gdb) r
Starting program: /home/liao6/workspace/rose/2019-10-31_14-16-05_-0700/myTranslator/./demo -c inputCode_ExampleTraversals.C

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 2, SgForStatement::post_construction_initialization (this=0x7fffe87db138) at Cxx_Grammar.C:139566
139566       if (p_for_init_stmt == NULL) {

// the data member storing the parent pointer of an AST node is p_parent
// it currently has a NULL value
(gdb) p p_parent
$3 = (SgNode *) 0x0

// we obtain the memory address of p_parent
(gdb) p &p_parent
$4 = (SgNode **) 0x7fffe87db140

// watch value changes at this address
// You must dereference the address with *, or gdb will refuse with "Cannot watch constant value"

(gdb) watch *0x7fffe87db140

// We can now watch the value changes to this memory address
// Let's restart the program from the beginning:

(gdb) r
Starting program: /home/liao6/workspace/rose/2019-10-31_14-16-05_-0700/myTranslator/./demo -c inputCode_ExampleTraversals.C
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Hardware watchpoint 2: *0x7fffe87db140

Old value = <unreadable>
New value = 0

SgNode::SgNode (this=0x7fffe87db138) at Cxx_Grammar.C:2128
2128         p_isModified = false;

// we check where its value is changed for the first time: the constructor of the base class SgNode

(gdb) bt
#0  SgNode::SgNode (this=0x7fffe87db138) at Cxx_Grammar.C:2128
#1  0x00007ffff3d19f01 in SgLocatedNode::SgLocatedNode (this=0x7fffe87db138, startOfConstruct=0x0) at Cxx_Grammar.C:85278
#2  0x00007ffff3d59798 in SgStatement::SgStatement (this=0x7fffe87db138, startOfConstruct=0x0) at Cxx_Grammar.C:134029
#3  0x00007ffff3d59fcc in SgScopeStatement::SgScopeStatement (this=0x7fffe87db138, file_info=0x0) at Cxx_Grammar.C:134289
#4  0x00007ffff54e54e0 in SgForStatement::SgForStatement (this=0x7fffe87db138, test=0x0, increment=0x0, loop_body=0x0)
    at Cxx_GrammarNewConstructors.C:5230
#5  0x00007ffff5bb04ce in EDG_ROSE_Translation::parse_statement (sse=..., existingBasicBlock=0x0)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:49637
#6  0x00007ffff5bbb5ea in EDG_ROSE_Translation::parse_statement_list (sse=..., orig_kind=iek_statement, orig_ptr=0x1162200)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:53079
#7  0x00007ffff5bb0221 in EDG_ROSE_Translation::parse_statement (sse=..., existingBasicBlock=0x7fffe8934470)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:49492
#8  0x00007ffff5c09217 in EDG_ROSE_Translation::parse_function_body<SgFunctionDeclaration> (sse_base=..., p=0x1151fc0, decl=0x7fffe9e21e68)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:36262
#9  0x00007ffff5b844fa in EDG_ROSE_Translation::convert_routine (p=0x1151fc0, forceTemplateDeclaration=false, edg_template=0x0,
    optional_nondefiningTemplateDeclaration=0x0) at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:34343
#10 0x00007ffff5b703cf in EDG_ROSE_Translation::parse_routine (sse=..., forceTemplateDeclaration=false, edg_template=0x0, forceSecondaryDeclaration=false)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:29866
#11 0x00007ffff5be6f78 in EDG_ROSE_Translation::parse_global_or_namespace_scope_entity (sse=...)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:64638
#12 0x00007ffff5bea2df in EDG_ROSE_Translation::parse_global_scope (inputGlobalScope=0x7ffff7ec3120, sse=..., skip_ast_translation=false)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:65427
#13 0x00007ffff5bedbee in sage_back_end (sageFile=...) at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:66777
#14 0x00007ffff5beea8a in cfe_main (argc=44, argv=0x702f80, sageFile=...)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:66992
#15 0x00007ffff5beebe7 in edg_main (argc=44, argv=0x702f80, sageFile=...)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:67093
#16 0x00007ffff3c14629 in SgSourceFile::build_C_and_Cxx_AST (this=0x7fffeb45e010, argv=..., inputCommandLine=...)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:5430
#17 0x00007ffff3c1587a in SgSourceFile::buildAST (this=0x7fffeb45e010, argv=..., inputCommandLine=...)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:5983
#18 0x00007ffff3c0e5b7 in SgFile::callFrontEnd (this=0x7fffeb45e010) at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:3119
#19 0x00007ffff3c0b576 in SgSourceFile::callFrontEnd (this=0x7fffeb45e010)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:2137
#20 0x00007ffff3c0a005 in SgFile::runFrontend (this=0x7fffeb45e010, nextErrorCode=@0x7fffffffaadc: 0)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:1606
#21 0x00007ffff3c12924 in Rose::Frontend::RunSerial (project=0x7fffeb555010)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:4613
#22 0x00007ffff3c12593 in Rose::Frontend::Run (project=0x7fffeb555010) at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:4506
#23 0x00007ffff3c0b84d in SgProject::RunFrontend (this=0x7fffeb555010) at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:2209
#24 0x00007ffff3c0bcb2 in SgProject::parse (this=0x7fffeb555010) at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:2334
#25 0x00007ffff3c0b0d4 in SgProject::parse (this=0x7fffeb555010, argv=...)
    at ../../../../sourcetree/src/frontend/SageIII/sage_support/sage_support.cpp:2028
#26 0x00007ffff3cbd2e9 in SgProject::SgProject (this=0x7fffeb555010, argv=..., frontendConstantFolding=false) at Cxx_Grammar.C:29114
#27 0x00007ffff645fd54 in frontend (argv=..., frontendConstantFolding=false) at ../../../sourcetree/src/roseSupport/utility_functions.C:628
#28 0x00007ffff645fc10 in frontend (argc=3, argv=0x7fffffffb578, frontendConstantFolding=false)
    at ../../../sourcetree/src/roseSupport/utility_functions.C:590
#29 0x000000000040b152 in main (argc=3, argv=0x7fffffffb578) at demo.C:40

// We continue the execution

(gdb) c
Continuing.
Hardware watchpoint 2: *0x7fffe87db140

Old value = 0
New value = -393001872
SgNode::set_parent (this=0x7fffe87db138, parent=0x7fffe8934470) at Cxx_Grammar.C:1684
1684         if ( ( variantT() == V_SgClassDeclaration ) && ( parent != NULL && parent->variantT() == V_SgFunctionParameterList ) )

//  Now we have found that this p_parent field is set by calling set_parent(). We can inspect the call stack and other things of interest
(gdb) bt
#0  SgNode::set_parent (this=0x7fffe87db138, parent=0x7fffe8934470) at Cxx_Grammar.C:1684
#1  0x00007ffff5bb04ef in EDG_ROSE_Translation::parse_statement (sse=..., existingBasicBlock=0x0)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:49643
#2  0x00007ffff5bbb5ea in EDG_ROSE_Translation::parse_statement_list (sse=..., orig_kind=iek_statement, orig_ptr=0x1162200)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:53079
#3  0x00007ffff5bb0221 in EDG_ROSE_Translation::parse_statement (sse=..., existingBasicBlock=0x7fffe8934470)
    at ../../../../../../sourcetree/src/frontend/CxxFrontend/EDG/edgRose/edgRose.C:49492
.... // omitted

(gdb) c
Continuing.
Found a for loop ...
Found a for loop ...
Traversal ends here.
[Inferior 1 (process 54495) exited normally]

// No more value changes to the same memory address, as expected.

A translator shipped with ROSE

This is also called in-tree or in-sourcetree build. libtool is used to build the translators.

ROSE turns on -O2 and -g by default, so the translators shipped with ROSE should already have some debugging information available. However, some variables may be optimized away. To preserve as much debugging information as possible, you may have to reconfigure and recompile ROSE with optimizations turned off.

../sourcetree/configure --with-CXX_DEBUG=-g --with-C_OPTIMIZE=-O0 --with-CXX_OPTIMIZE=-O0  ...

ROSE uses libtool, so the executables in the build tree are not real; they are simply wrappers around the actual executable files. You have two choices:

Find the real executable in the .libs directory and debug it there
Use the libtool command line as follows:

$ libtool --mode=execute gdb --args ./built_in_translator file1.c

If you can set up an alias in your .bashrc, add the following:

alias debug='libtool --mode=execute gdb -args'

then all your debugging sessions can be as simple as

$ debug ./built_in_translator file1.c

The remaining steps are the same as a regular gdb session with the typical operations, such as breakpoints, printing data, etc.

Example 2: Fixing a real bug in ROSE

1. Reproduce the reported bug:

$ make check
...
./testVirtualCFG \
    --edg:no_warnings -w -rose:verbose 0 --edg:restrict \
    -I$ROSE/tests/CompileTests/virtualCFG_tests/../Cxx_tests \
    -I$ROSE/sourcetree/tests/CompileTests/A++Code \
    -c $ROSE/sourcetree/tests/CompileTests/virtualCFG_tests/../Cxx_tests/test2001_01.C

...
lt-testVirtualCFG: $ROSE/src/frontend/SageIII/virtualCFG/virtualCFG.h:111:
    VirtualCFG::CFGEdge::CFGEdge(VirtualCFG::CFGNode, VirtualCFG::CFGNode):
    Assertion `src.getNode() != __null && tgt.getNode() != __null' failed.

Ah, so we've failed an assertion within the virtualCFG.h header file on line 111:

Assertion `src.getNode() != __null && tgt.getNode() != __null' failed

And the error was produced by running the lt-testVirtualCFG libtool executable translator, i.e. the actual translator name is testVirtualCFG (without the lt- prefix).

2. Run the same translator command line with Libtool to start a GDB debugging session:

$ libtool --mode=execute gdb --args ./testVirtualCFG \
    --edg:no_warnings -w -rose:verbose 0 --edg:restrict \
    -I$ROSE/tests/CompileTests/virtualCFG_tests/../Cxx_tests \
    -I$ROSE/sourcetree/tests/CompileTests/A++Code \
    -c $ROSE/sourcetree/tests/CompileTests/virtualCFG_tests/../Cxx_tests/test2001_01.C

GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-42.el5_8.1)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from ${ROSE_BUILD_TREE}tests/CompileTests/virtualCFG_tests/.libs/lt-testVirtualCFG...done.
(gdb)

The GDB session has started, and we're provided with a command line prompt to begin our debugging.

3. Let's run the program, which will hit the failed assertion:

(gdb) r
Starting program: \
    ${ROSE_BUILD_TREE}/tests/CompileTests/virtualCFG_tests/.libs/lt-testVirtualCFG \
    --edg:no_warnings -w -rose:verbose 0 --edg:restrict \
    -I${ROSE}/tests/CompileTests/virtualCFG_tests/../Cxx_tests \
    -I../../../../sourcetree/tests/CompileTests/A++Code
    -c   ${ROSE}/tests/CompileTests/virtualCFG_tests/../Cxx_tests/test2001_01.C
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000
[Thread debugging using libthread_db enabled]
lt-testVirtualCFG: ${ROSE}/src/frontend/SageIII/virtualCFG/virtualCFG.h:111:

VirtualCFG::CFGEdge::CFGEdge(VirtualCFG::CFGNode, VirtualCFG::CFGNode): Assertion `src.getNode() != __null && tgt.getNode() != __null' failed.

Program received signal SIGABRT, Aborted.
0x0000003752230285 in raise () from /lib64/libc.so.6

Okay, we've reproduced the problem in our GDB session.

4. Let's check the backtrace to see how we wound up at this failed assertion:

(gdb) bt
#0  0x0000003752230285 in raise () from /lib64/libc.so.6
#1  0x0000003752231d30 in abort () from /lib64/libc.so.6
#2  0x0000003752229706 in __assert_fail () from /lib64/libc.so.6

#3  0x00002aaaad6437b2 in VirtualCFG::CFGEdge::CFGEdge (this=0x7fffffffb300, src=..., tgt=...)
     at ${ROSE}/../src/frontend/SageIII/virtualCFG/virtualCFG.h:111
#4  0x00002aaaad643b60 in makeEdge<VirtualCFG::CFGNode, VirtualCFG::CFGEdge> (from=..., to=..., result=...)
     at ${ROSE}/../src/frontend/SageIII/virtualCFG/memberFunctions.C:82
#5  0x00002aaaad62ef7d in SgReturnStmt::cfgOutEdges (this=0xbfaf10, idx=1)
     at ${ROSE}/../src/frontend/SageIII/virtualCFG/memberFunctions.C:1471
#6  0x00002aaaad647e69 in VirtualCFG::CFGNode::outEdges (this=0x7fffffffb530)
     at ${ROSE}/../src/frontend/SageIII/virtualCFG/virtualCFG.C:636
#7  0x000000000040bf7f in getReachableNodes (n=..., s=...) at ${ROSE}/tests/CompileTests/virtualCFG_tests/testVirtualCFG.C:13
...

5. Next, we'll move up the call stack, one frame at a time, to reach the point of the failed assertion:

(gdb) up
#1  0x0000003752231d30 in abort () from /lib64/libc.so.6

(gdb) up
#2  0x0000003752229706 in __assert_fail () from /lib64/libc.so.6

(gdb) up
#3  0x00002aaaad6437b2 in VirtualCFG::CFGEdge::CFGEdge (this=0x7fffffffb300, src=..., tgt=...)
     at ${ROSE}/src/frontend/SageIII/virtualCFG/virtualCFG.h:111
111         CFGEdge(CFGNode src, CFGNode tgt): src(src), tgt(tgt) \
                   { assert(src.getNode() != NULL && tgt.getNode() != NULL); }

Okay, so the assertion is inside of a constructor for CFGEdge:

CFGEdge(CFGNode src, CFGNode tgt): src(src), tgt(tgt)
{
    assert(src.getNode() != NULL && tgt.getNode() != NULL);  // This is the failed assertion
}

Unfortunately, we can't tell at a glance which of the two conditions in the assertion is failing.

6. Figure out why the assertion is failing:

Let's examine the two conditions in the assertion:

(gdb) p src.getNode()
$1 = (SgNode *) 0xbfaf10

So src.getNode() is returning a non-null pointer to an SgNode. How about tgt.getNode()?

(gdb) p tgt.getNode()
$2 = (SgNode *) 0x0

Ah, there's the culprit. So for some reason, tgt.getNode() is returning a null SgNode pointer (0x0).

From here, we used the GDB up command to walk further up the call stack and find where the node returned by tgt.getNode() was assigned a NULL value.

We eventually reached SgReturnStmt::cfgOutEdges, which uses a local variable called enclosingFunc. The source code had no assertion checking the value of enclosingFunc, which is why the failure only surfaced later, in the CFGEdge constructor. As a side note, it is good practice to assert as early as possible in your source code, so that in situations like this we don't have to spend time tracing back unnecessarily.
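As a rough illustration of that advice, the sketch below asserts on a lookup result right where it is produced, instead of letting a null pointer travel further into the program. Function, findFunction, and nameLength are hypothetical stand-ins invented for this example; they are not ROSE types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for an AST entity; not a ROSE type.
struct Function {
    std::string name;
};

// Hypothetical lookup that returns nullptr when nothing is registered under `name`.
const Function* findFunction(const std::map<std::string, Function>& table,
                             const std::string& name) {
    auto it = table.find(name);
    return it == table.end() ? nullptr : &it->second;
}

// Asserts immediately after the lookup, so a failure points at this line
// rather than at some later use of the null pointer.
std::size_t nameLength(const std::map<std::string, Function>& table,
                       const std::string& name) {
    const Function* f = findFunction(table, name);
    assert(f != nullptr && "findFunction returned a null pointer");
    return f->name.size();
}
```

With the check placed here, a debugger stops in nameLength itself, and no backtracing through callers is needed to find where the null value originated.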

After adding the assertion for enclosingFunc, we run the program again to reach this new assertion point:

lt-testVirtualCFG: ${ROSE}sourcetree/src/frontend/SageIII/virtualCFG/memberFunctions.C:1473: \
    virtual std::vector<VirtualCFG::CFGEdge, std::allocator<VirtualCFG::CFGEdge> > \
    SgReturnStmt::cfgOutEdges(unsigned int): \

    Assertion `enclosingFunc != __null' failed.

Okay, it's failing, so we know that enclosingFunc is being assigned NULL.

# enclosingFunc is definitely NULL (0x0)
(gdb) p enclosingFunc
$1 = (SgFunctionDefinition *) 0x0

# What is the current context?
(gdb) p this
$2 = (SgReturnStmt * const) 0xbfaf10

Okay, we're inside of an SgReturnStmt object. Let's set a breakpoint at the line where enclosingFunc is assigned:

Breakpoint 1, SgReturnStmt::cfgOutEdges (this=0xbfaf10, idx=1) at ${ROSE}/src/frontend/SageIII/virtualCFG/memberFunctions.C:1472
1472              SgFunctionDefinition* enclosingFunc = SageInterface::getEnclosingProcedure(this);

So this is the line we're examining:

SgFunctionDefinition* enclosingFunc = SageInterface::getEnclosingProcedure(this);

So the NULL value must be coming from SageInterface::getEnclosingProcedure(this).

After reviewing the code of getEnclosingProcedure, we discovered a flaw in the algorithm.

The function is supposed to return an SgNode that is the enclosing procedure of the specified type, SgFunctionDefinition. However, upon checking the function's state at the point of return, we see that it incorrectly detected an SgBasicBlock as the enclosing procedure for the SgReturnStmt.

(gdb) p parent->class_name()
$12 = {static npos = 18446744073709551615,
   _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7cd0e8 "SgBasicBlock"}}

Specifically, the last piece: 0x7cd0e8 "SgBasicBlock".

But this is wrong because we're looking for SgFunctionDefinition, not SgBasicBlock.

Upon further examination, we figured out that the function simply returned the first enclosing node it found, and not the first enclosing node that matched the user's criteria.

We added the necessary logic to make the function complete, tested it to verify its correctness, and then resolved the bug.
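The difference between "first enclosing node" and "first enclosing node of the requested type" can be sketched with a miniature AST. Everything below (Node, BasicBlock, FunctionDefinition, ReturnStmt, firstEnclosingOnly, enclosingNode) is a hypothetical stand-in written for this illustration. It is not the ROSE implementation of getEnclosingProcedure, and the buggy variant is only an assumed shape consistent with the symptoms described above.

```cpp
#include <cassert>

// Hypothetical miniature AST; the names echo ROSE's but these are not ROSE classes.
struct Node {
    Node* parent = nullptr;
    virtual ~Node() = default;  // makes Node polymorphic so dynamic_cast works
};
struct BasicBlock : Node {};
struct FunctionDefinition : Node {};
struct ReturnStmt : Node {};

// Assumed buggy shape: looks only at the immediate parent, so the result is
// null whenever that parent is not of type T.
template <typename T>
T* firstEnclosingOnly(Node* n) {
    assert(n != nullptr);
    return dynamic_cast<T*>(n->parent);
}

// Corrected shape: keeps climbing until an ancestor of type T is found,
// and returns null only when the root is reached without a match.
template <typename T>
T* enclosingNode(Node* n) {
    assert(n != nullptr);
    for (Node* p = n->parent; p != nullptr; p = p->parent) {
        if (T* match = dynamic_cast<T*>(p)) {
            return match;
        }
    }
    return nullptr;
}
```

For a return statement nested as FunctionDefinition > BasicBlock > ReturnStmt, firstEnclosingOnly<FunctionDefinition> yields a null pointer because the immediate parent is the basic block, while enclosingNode<FunctionDefinition> skips the block and finds the function definition.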

How to add a new project directory

Most code development that is layered above the ROSE library starts out its life as a project in the projects directory. Some projects are eventually refactored into the ROSE library once they mature. This chapter describes how one adds a new project to ROSE.

Method 1: A new, simpler way

Robb Matzke added a feature to ROSE that makes it easier to add a new project to ROSE/projects:

Create a $ROSE/projects/whatever directory.
In that directory, create a "rose.config" file
In that file, add the line AC_CONFIG_FILES(projects/whatever/Makefile)

Running ./build will then update rose/config/support-projects.m4.

You still need to provide your own Makefile.am; see "Setting up Makefile.am" below.

Method 2: Required Files

A ROSE project encapsulates a complete program or set of related programs that use the ROSE library. Each project exists as a subdirectory of the ROSE "projects" directory and should include files "README", "config/support-rose.m4", "Makefile.am", and any necessary source files, scripts, tests, etc.

The "README" should provide an explanation about the project purpose, algorithm, design, implementation, etc.
The "support-rose.m4" integrates the project into the ROSE build system in a manner that allows the project to be an optional component (it can be disabled, renamed, deleted, or withheld from distribution without changing any ROSE configuration files). Most older projects lack this file and are thus more tightly coupled with the build system.
The "Makefile.am" serves as the input to the GNU automake system that ROSE employs to generate Makefiles.
Each project should also include all necessary source files, documentation, and test cases.

Setting up support-rose.m4

The "config/support-rose.m4" file integrates the project into the ROSE configure and build system. At a minimum, it should contain a call to the autoconf AC_CONFIG_FILES macro with a list of the project's Makefiles (without the ".am" extension) and its doxygen configuration file (without the ".in" extension). It may also contain any other necessary autoconf checks that are not already performed by ROSE's main configure scripts, including code to enable/disable the project based on the availability of the project's prerequisites.

Here's an example:

dnl List of all makefiles and autoconf-generated                          -*- autoconf -*-
dnl files for this project
AC_CONFIG_FILES([projects/DemoProject/Makefile
                 projects/DemoProject/gui/Makefile
                 projects/DemoProject/doxygen/doxygen.conf
                ])

dnl Even if this project is present in ROSE's "projects" directory, we might not have the
dnl prerequisites to compile this project.  Enable the project's makefiles by using the
dnl ROSE_ENABLE_projectname automake conditional.  Many prerequisites have probably already
dnl been tested by ROSE's main configure script, so we don't need to list them here again
dnl (although it usually doesn't hurt).
AC_MSG_CHECKING([whether DemoProject prerequisites are satisfied])
if test "$ac_cv_header_gcrypt_h" = "yes"; then
        AC_MSG_RESULT([yes])
        rose_enable_demo_project=yes
else
        AC_MSG_RESULT([no])
        rose_enable_demo_project=
fi
AM_CONDITIONAL([ROSE_ENABLE_DEMO_PROJECT], [test "$rose_enable_demo_project" = yes])

Since all configuration for the project is encapsulated in the "support-rose.m4" file, renaming, disabling, or removing the project is trivial: a project can be renamed simply by renaming its directory, it can be disabled by renaming/removing "support-rose.m4", or it can be removed by removing its directory. The "build" and "configure" scripts should be rerun after any of these changes.

Since projects are self-encapsulated and optional parts of ROSE, they need not be distributed with ROSE. This enables end users to drop in their own private projects to an existing ROSE source tree without modifying any ROSE files, and it allows ROSE developers to work on projects that are not distributed publicly. Any project directory that is not part of ROSE's main Git repository will not be distributed (this includes not distributing Git submodules, although the submodule's placeholder empty directory will be distributed).

Setting up Makefile.am

Each project should have at least one Makefile.am, each of which is processed by GNU automake and autoconf to generate a Makefile. See documentation for automake for details about what these files should contain. Some important variables and targets are:

include $(top_srcdir)/config/Makefile.for.ROSE.includes.and.libs: This brings in the definitions from the higher level Makefiles and is required by all projects. It should be near the top of the Makefile.am.
SUBDIRS: This variable should contain the names of all the project's subdirectories that have Makefiles. It may be omitted if the project's only Makefile is in that project's top-level directory.
INCLUDES: This would have the flags that need to be added during compilation (flags like -I$(top_srcdir)/projects/RTC/include). Your flags should be placed before $(ROSE_INCLUDES) to ensure the correct files are found. This brings in all the necessary headers from the src directory to your project.
lib_*: These variables/targets are necessary if you are creating a library from your project, which can be linked in with other projects or the src directory later. This is the recommended way of handling projects.
EXTRA_DIST: These are the files that are not listed as being needed to build the final object (like source and header files), but must still be in the ROSE tarball distribution. This could include README or configuration files, for example.
check-local: This is the target that will be called from the higher level Makefiles when make check is called.
clean-local: Provides you with a step to perform manual cleanup of your project, for instance, if you manually created some files (so Automake won't automatically clean them up).

A basic example

Many projects start as a translator, analyzer, or optimizer that takes input code and generates output.

A basic sample commit which adds a new project directory into ROSE: https://github.com/rose-compiler/rose/commit/edf68927596960d96bb773efa25af5e090168f4a

Please look through the diffs so you know which files need to be added and changed for a new project.

Essentially, a basic project should contain

a README file explaining what the project is about: its purpose, algorithm, design, implementation, etc.
a translator that acts as the driver of your project
additional source files and headers as needed to contain the meat of your project
test input files
a Makefile.am to
- compile and build your translator
- provide a make check rule so that your translator is invoked to process your input files and generate the expected results

To connect your project into ROSE's build system, you also need to

Add one more subdir entry to projects/Makefile.am for your project directory.
Add one line to config/support-rose.m4 for EACH new Makefile (generated from each Makefile.am) used by your project.

Installing project targets

Install your project's content to a separate directory within the user's specified --prefix location. The reason behind this is that we don't want to pollute the core ROSE installation space. By doing so, we can reduce the complexity and confusion of the ROSE installation tree, while eliminating cross-project file collisions. It also keeps the installation tree modular.

Example

This example uses a prefix for installation. It also maintains Semantic Versioning.

From projects/RosePoly:

  ## 1. Version your project properly (http://semver.org/)
  rosepoly_API_VERSION=0.1.0

  ## 2. Install to separate directory
  ##
  ##    Installation tree should resemble:
  ##
  ##    <--prefix>
  ##    |--bin      # ROSE/bin
  ##    |--include  # ROSE/include
  ##    |--lib      # ROSE/lib
  ##    |
  ##    |--<project>-<version>
  ##       |--bin      # <project>/bin
  ##       |--include  # <project>/include
  ##       |--lib      # <project>/lib
  ##
  exec_prefix=${prefix}/rosepoly-$(rosepoly_API_VERSION)

  ## Installation/include tree should resemble:
  ##   |--<project>-<version>
  ##      |--bin      # <project>/bin
  ##      |--include  # <project>/include
  ##         |--<project>
  ##      |--lib      # <project>/lib
  librosepoly_la_includedir = ${exec_prefix}/include/rosepoly

Generate Doxygen Documentation

0. Install Doxygen tool

Using MacPorts for Apple's Mac OS:

  $ port install doxygen

  # add MacPorts' bin/ directory to your PATH
  # ...

Using one of the LLNL machines:

  $ export PATH=/nfs/apps/doxygen/latest/bin:$PATH

1. Create a Doxygen configuration file

  $ doxygen -g

Configuration file `Doxyfile' created.

Now edit the configuration file and enter

  doxygen Doxyfile

to generate the documentation for your project

2. Customize the configuration file (Doxyfile):

...

# If the EXTRACT_ALL tag is set to YES doxygen will assume all entities in
# documentation are documented, even if no documentation was available.
# Private class members and static file members will be hidden unless
# the EXTRACT_PRIVATE and EXTRACT_STATIC tags are set to YES

EXTRACT_ALL            = YES

...

# If the value of the INPUT tag contains directories, you can use the
# FILE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp
# and *.h) to filter out the source-files in the directories. If left
# blank the following patterns are tested:
# *.c *.cc *.cxx *.cpp *.c++ *.d *.java *.ii *.ixx *.ipp *.i++ *.inl *.h *.hh
# *.hxx *.hpp *.h++ *.idl *.odl *.cs *.php *.php3 *.inc *.m *.mm *.dox *.py
# *.f90 *.f *.for *.vhd *.vhdl

FILE_PATTERNS          = *.cpp *.hpp

# The RECURSIVE tag can be used to turn specify whether or not subdirectories
# should be searched for input files as well. Possible values are YES and NO.
# If left blank NO is used.

RECURSIVE              = YES

...

3. Generate the Doxygen documentation

  # Invoke from your top-level directory
  $ doxygen Doxyfile

4. View and verify the HTML documentation

  $ firefox html/index.html &

5. Add target to your Makefile.am to generate the documentation

.PHONY: docs
docs:
    doxygen Doxyfile # TODO: should be $(DOXYGEN)

How to fix a bug

If you are trying to fix a bug (your own, or one assigned to you), here are the high-level steps to follow.

Reproduce the bug

You can only fix a bug when you can reproduce it. This step may be more difficult than it sounds. In order to reproduce a bug, you have to

find a proper input file
find a proper translator: a translator shipped with ROSE is easy to find, but be patient and polite when you ask users for a translator they wrote themselves.
find a similar/identical software and hardware environment: a bug may only appear on a specific platform when a specific software configuration is used

Possible results for this step:

You can reproduce the bug reliably. Bingo! Go to the next step.
You cannot reproduce the bug. Either the bug report is invalid or you have to keep trying.
You can reproduce the bug only once in a while (random errors). Oops. This is a difficult situation.

Find causes of the bug

Once you can reproduce the bug, you have to identify its root cause using a debugger such as gdb.

Common steps involved

simplify the input code as much as possible: it can be very hard to debug a problem with a huge input. Always try to prepare the simplest possible code that still triggers the bug.
- Often, you have to use a binary search approach to narrow down the input code: try only half of the input at a time. Keep cutting the input file in two until no further cut is possible while still triggering the bug.
forward tracking: a translator usually takes input and generates intermediate results before producing the final output. Use a debugger to set breakpoints at each critical stage of the code and check whether the intermediate results are what you expect.
backward tracking: similar to the previous technique, except that you start from the point of failure and trace the problem backwards.
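The binary search over the input described above can be sketched as a small routine. This is a minimal illustration under stated assumptions: bisectInput and the stillFails callback are hypothetical names, and in practice the callback would write the candidate lines to a file and run the failing translator on it. Halving arbitrary source lines can also produce code that no longer compiles, so the callback must report only the original failure.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Lines = std::vector<std::string>;

// Repeatedly keeps whichever half of the input still triggers the bug.
// Stops when a single line remains or when neither half fails on its own.
Lines bisectInput(Lines lines, const std::function<bool(const Lines&)>& stillFails) {
    assert(stillFails(lines));  // precondition: the full input reproduces the bug
    while (lines.size() > 1) {
        const auto mid = lines.begin() + static_cast<Lines::difference_type>(lines.size() / 2);
        Lines first(lines.begin(), mid);
        Lines second(mid, lines.end());
        if (stillFails(first)) {
            lines = std::move(first);
        } else if (stillFails(second)) {
            lines = std::move(second);
        } else {
            break;  // the bug needs lines from both halves; reduce by hand from here
        }
    }
    return lines;
}
```

The routine deliberately stops as soon as neither half reproduces the failure; more thorough reduction strategies exist, but this is the plain halving approach the step describes.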

Fix the bug

Any bug fix commit should contain

a regression test: so make check rules can make sure the bug is actually fixed and no further code changes will make the bug relapse.

How to add a ROSE commandline option

Often a feature added into ROSE comes with a set of command line options. These options can enable and customize the feature.

For example, the OpenMP support in ROSE is disabled by default. A special option is needed to enable it. Also, the support can be as little as simply parsing the OpenMP directives or as complex as translating them into multithreaded code.

This HOWTO quickly goes through the key steps to add options.

internal flags

Options need to be stored somewhere. There are several choices for the storage,

as a data member of SgProject, if the option is applicable to all files associated with an SgProject
as a data member of SgFile, if the option is applicable to a single source file, or
a member variable in a namespace you define, if the option is for some transformation or analysis.

If the option can be as specific as per file, it is recommended to add a new data member to SgFile to save the option value.

For example, here is a command line option to turn on the UPC language support:

ROSE/src/ROSETTA/src/support.C // add a data member for SgFile

    // Liao (6/6/2008): Support for UPC model of C , 6/19/2008: add support for static threads compilation
    File.setDataPrototype         ( "bool", "UPC_only", "= false",
                                    NO_CONSTRUCTOR_PARAMETER, BUILD_ACCESS_FUNCTIONS, NO_TRAVERSAL, NO_DELETE);

ROSETTA processes this information to automatically generate a member and the corresponding member access functions (set/get_member()).

process the option

Command line options should be handled within src/frontend/SageIII/sage_support/cmdline.cpp.

File level options are handled by void SgFile::processRoseCommandLineOptions ( vector<string> & argv )

Example code for processing the -rose:openmp option

     set_openmp(false);
     ROSE_ASSERT (get_openmp() == false);
       ...
     if ( CommandlineProcessing::isOption(argv,"-rose:","(OpenMP|openmp)",true) == true )
        {
          if ( SgProject::get_verbose() >= 1 )
               printf ("OpenMP option specified \n");
          set_openmp(true);
         //side effect for enabling OpenMP, define the macro as required
           argv.push_back("-D_OPENMP");
        }

ROSE command line options should be removed after being processed, to avoid confusing the backend compiler.

SgFile::stripRoseCommandLineOptions ( vector<string>& argv ) should have the code to strip off the option.
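The process-then-strip pattern can be sketched without any ROSE dependency. consumeOption below is a hypothetical helper written for this illustration only. It is not CommandlineProcessing::isOption, and it makes no claim about how SgFile::stripRoseCommandLineOptions is actually implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of ROSE: reports whether `option` appears in
// argv and, when removeOption is true, erases every occurrence so that the
// backend compiler never sees it.
bool consumeOption(std::vector<std::string>& argv, const std::string& option,
                   bool removeOption) {
    assert(!option.empty());
    const bool found = std::find(argv.begin(), argv.end(), option) != argv.end();
    if (found && removeOption) {
        // Erase-remove idiom: shift the kept elements forward, then trim the tail.
        argv.erase(std::remove(argv.begin(), argv.end(), option), argv.end());
    }
    return found;
}
```

Called on a command line such as {"translator", "-rose:openmp", "-c", "input.c"} with removeOption set to true, the helper returns true and leaves {"translator", "-c", "input.c"} for the backend compiler.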

use the option

In your code, you can use the automatically generated access functions to set/retrieve the stored option values.

For example

  if (sourceFile->get_openmp())
     //... do something here ....

document the option

Any option should be explained by the online help output.

Please add brief help text for your option in void SgFile::usage ( int status ) of ./src/frontend/SageIII/sage_support/cmdline.cpp: