ROSE Compiler Framework/Coding Standard

What to Expect and What to Avoid

This page documents the current recommended practice of how we should write code within the ROSE project. It also serves as a guideline for our code review process.

New code should follow the conventions described in this document from the very beginning.

Updates to existing code that follows a different coding style should only be performed if you are the maintainer of the code.

The order of sections in this coding standard follows a top-down approach: big things first, then drilling down to fine-grained details.

Six Principles

We use the coding standard to reflect the principal things we value in all contributions to ROSE:

  • Documentation: What are the commits about? Is this reflected in commit messages, README, source comments, or LaTeX files within the same commits?
  • Style: Is the coding style consistent with the required and recommended formats? Is the code clean and pleasant and easy to read?
  • Interface: Does the code have a clean and simple interface to be used by users?
  • Algorithm: Does the code have sufficient comments about what algorithm is used? Is the algorithm correct and efficient (space and time complexity)?
  • Implementation: Does the implementation correctly implement the documented algorithms?
  • Testing: Does the code have the accompanying test translator and input to ensure the contributions do what they are supposed to do?
    • Is Jenkins configured to trigger these tests? Local tests on a developer's workstation do not count.

Avoid Coding Standard War

We directly quote text from http://www.parashift.com/c++-faq/coding-std-wars.html, as follows:

"Nearly every software engineer has, at some point, been exploited by someone who used coding standards as a power play. Dogmatism over minutia is the purview of the intellectually weak. Don't be like them. These are those who can't contribute in any meaningful way, who can't actually improve the value of the software product, so instead of exposing their incompetence through silence, they blather with zeal about nits. They can't add value in the substance of the software, so they argue over form. Just because "they" do that doesn't mean coding standards are bad, however.

Another emotional reaction against coding standards is caused by coding standards set by individuals with obsolete skills. For example, someone might set today's standards based on what programming was like N decades ago when the standards setter was writing code. Such impositions generate an attitude of mistrust for coding standards. As above, if you have been forced to endure an unfortunate experience like this, don't let it sour you to the whole point and value of coding standards. It doesn't take a very large organization to find there is value in having consistency, since different programmers can edit the same code without constantly reorganizing each others' code in a tug-of-war over the "best" coding standard."

Must, Should and Can

The terms must, should and can have special meaning.

  • A must requirement must be followed,
  • A should is a strong recommendation,
  • A can is a general guideline.

Got New Ideas, Suggestions

This page is not the place to write down new ideas, concepts, or suggestions to be used in the future. If you have suggestions, put them on the discussion tab of this page.

We do welcome suggestions for improvements and changes so we can do things faster and better.

Git Convention

Name and Email

Before you commit your local changes, you MUST ensure that you have correctly configured your author and email information (on all of your machines). Having a recognizable and consistent name and email will make it easier for us to evaluate the contributions that you've made to our project.

Guidelines:

  • Name: You MUST use the official name you commonly use for work/business, not a nickname or alias that cannot be easily recognized by co-workers, managers, or sponsors.
  • Email: You MUST use the email address you commonly use for work. It can be either your company email or your personal email (e.g. Gmail) if you DO commonly use that personal email for business purposes.

To check if your author and email are configured correctly:

  $ git config user.name
  <your name>

  $ git config user.email
  <your email>

Alternatively, you can just type the following to list all your current git configuration variables and values, including name and email information.

  $ git config -l


To set your name and email:

  $ git config --global user.name "<Your Name>"
  $ git config --global user.email "<your@email.com>"

Commit messages

It is important to have concise and accurate commit messages to help code reviewers do their work.

Latest requirements

Example commit message, excerpt from link

(Binary Analysis) SMT solver statistics; documentation

* Replaced the SMT class-wide number-of-calls statistic with a
  more flexible and extensible design that also tracks the amount
  of I/O between ROSE and the SMT solver.  The new method tracks
  statistics on a per-solver basis as well as a class-wide basis, and
  allows the statistics to be reset at arbitrary points by the user.

* More documentation for the new memory cell, memory state, and X86
  register state classes.

  • (Required) Summary: the first line of the commit message is a one-line summary (<50 words) of the commit. Start the summary with a topic, enclosed in parentheses, to indicate the project, feature, bugfix, etc. that this commit represents.
  • (Optional) Use a bullet list (one asterisk, *, per item) to elaborate on the commit.

Also see http://spheredev.org/wiki/Git_for_the_lazy#Writing_good_commit_messages.

Design Document

Overview

"The software design document is a written contract between you, your team, your project manager and your client. When you document your assumptions, decisions and risks, it gives the team members and stakeholders an opportunity to agree or to ask for clarifications and modifications. Once the software design document is approved by the appropriate parties, it becomes a baseline for limiting changes in the scope of the project." - How to Write a Software Design Document | eHow.com

We are still in the process of defining the requirements for design documents, but preliminarily, here are the initial rules for writing a design document for a ROSE module (an analysis, transformation, optimization, etc.).

(We thank Professor Vivek Sarkar at Rice University for his insightful comments for some of the initial design document requirements.)

Guideline

  • All new ROSE analyses, transformations, and optimizations must have an accompanying design document, to be peer-reviewed, before the actual implementation begins.
  • Be specific enough that someone with ROSE skills who is not the original designer could (in principle) implement the design just by looking at the document.
  • Different developers are expected to make different low-level choices about data structures, etc.

Requirement vs. Design Document

If the requirements document is the "why" of the software, then the technical design document is the "how to". For simplicity, we put both requirements and design into a single document for now. We allow a separate requirements analysis document if necessary.

The purpose of writing the technical design document is to guide developers in implementing (and fulfilling) the requirements of the software--it's the software's blueprint.

Format

Documents must be:

  • Written in LaTeX for re-usability in publications and proposals.
  • Stored under version control to support collaborative writing.

Your document should, at a minimum, include these formal sections:

  • Title page
  • Author information: who participates in the major writing
  • Reviewer information: who reviews and approves the document
  • Table of contents
  • Page numbering format
  • Section numbers
  • Revision history

Content

Major Sections

  • Overview
    • Explain the motivation and goal of the module: what does this module do, the goal, the problem to address, etc.
  • Requirement analysis: what is required for this module
    • Define the interface: namespace, function names, parameters, return values. How others can call this module and obtain the returned results
    • Performance requirement: time and space complexity
    • Scope of input/test codes: what types of languages to be supported, the constructs of a language to be supported, the benchmarks to be used
  • Design considerations
    • Assumptions
    • Constraints
    • Tradeoffs and limitations: why this algorithm, what are the priorities, etc.
    • Non-standard elements: Definitions of any non-standard symbols, shapes, acronyms, and unique terms in the document
    • Game plan: How each requirement will be achieved
  • Internal software workflow
    • Diagrams: logical structure and logical processing steps. MUST have a UML diagram or PowerPoint diagram
    • Pseudo code: MUST have pseudo code to describe key data structures and high-level algorithm steps
    • Example: MUST illustrate the designed algorithm by using at least one sample input code to walk through the important intermediate results of the algorithm.
    • Error, alarm and warning messages, optional
  • Performance: MUST have complexity analysis. Estimate the time and space complexity of this module so users can know what to expect
  • Reliability (Optional)
  • Related work: cite relevant work in textbooks and papers

Development guidelines

  • Coding guidelines: standards and conventions.
  • Standard languages and tools
  • Definitions of variables and a description of where they are used

TODO

  • a sample design document

Testing

Rules

  • All contributions MUST have the accompanying test translator and input files to demonstrate the contributions work as expected.
  • All tests MUST be triggered by the "make check" rule
  • All tests should be self-verifying, so that they confirm the correct results are generated
  • All tests MUST be activated by at least one of the integration tests of Jenkins (the test jobs used to check if something can be merged into our central repository's master branch)
    • This will ensure that no future commits can break your contributions.
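
As a rough illustration of self-verification, the sketch below compares a computed result with an expected value and reports failure through a non-zero return value, which a test program can pass on as its exit status. The names countStatements and runSelfCheck and the toy input are hypothetical and not part of ROSE; a real test would run a translator on an input file.

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-in for the functionality under test:
// counts the semicolons in a source string.
int countStatements(const std::string& source)
{
  int count = 0;
  for (std::string::size_type i = 0; i < source.size(); i++)
  {
    if (source[i] == ';')
      count++;
  }
  return count;
}

// Self-verifying check: returns 0 on success and 1 on failure, so the
// value can be used directly as the exit status of a test program.
int runSelfCheck()
{
  const std::string input = "int a; int b; a = b;";
  const int expectedCount = 3;
  const int actualCount = countStatements(input);
  if (actualCount != expectedCount)
  {
    std::cerr << "FAILED: expected " << expectedCount
              << " statements, got " << actualCount << std::endl;
    return 1;
  }
  return 0;
}
```

A test program's main() can simply return runSelfCheck(), so that a wrong result makes the "make check" rule fail.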

Programming Languages

Core Languages

Only C++ is allowed. Any other programming language is an exception on a case-by-case basis.

Question: But Programming language XYZ is much better than C++ and I am really good at XYZ!!!

Answer: We can allow XYZ only if

  • You can teach at least one of the old dogs (staff members) of our team the new tricks to efficiently use XYZ
  • You will be around in our team for the next 5 to 10 years to maintain all the code written in XYZ if none of the old dogs have the time or interest to switch to XYZ
  • You can prove that XYZ can interact well with the existing C++ code in ROSE

Scripting Languages

Only two scripting languages are allowed

  • bash shell scripting
  • Perl

Again, this is just a preference of the staff members and what we have now. Allowing an uncontrolled number of scripting languages in a single project will make the project impossible to maintain and hard to learn.

Naming Conventions

The order of sub-sections reflects a top-down approach for how things are added during the development cycle: from directory --> file --> namespace --> etc.

General

  • Language: all names should be written in English since it is the preferred language for development, internationally
  • fileName; // NOT: filNavn

Abbreviations and Acronyms

Avoid ambiguous abbreviations; strike a good balance between clarity and productivity.

Abbreviations and acronyms should NOT be uppercase when used in a name:

  • exportHtmlSource(); // NOT: exportHTMLSource();
  • openDvdPlayer(); // NOT: openDVDPlayer();

Likewise, commonly-lowercase abbreviations and acronyms should NOT start with a lower-case letter when used in a CamelCase name:

  • SgAsmX86Instruction // NOT: SgAsmx86Instruction
  • myIpod // NOT: myiPod

File/Directory

Case:

  • camelCase like fileName.hpp: This is consistent with existing names used in ROSE

File Extension:

  • Header files: .h or .hpp
  • Source files: .cpp or .cxx
    • .C should be avoided so that ROSE works on file systems that do not distinguish between lower and upper case.

Namespaces

  • A namespace should represent a logical unit, usually encapsulated in a single header file within a specific directory.
  • CamelCase for namespaces, such as SageInterface, SageBuilder, etc.
    • avoid lower case names, bad names: sage_interface
  • use singular for nouns within namespace names, avoid plural
  • use full words, avoid abbreviations
  • use at least two words to reduce name collision

Reason: the name convention of namespace is meant to be compatible with existing code and consistent with function names within namespaces.

  • A CamelCase namespace pairs nicely with doSomething(), as in NameSpace::doSomething()
  • lower case namespace names may look inconsistent, such as name_space_1::doSomething()
  • many existing namespaces in ROSE already follow CamelCase, as shown at link

[Note] Leo: I believe this should be more discussed with ROSE Compiler Framework/ROSE API.

Types

MUST be in mixed case starting with an uppercase letter, as in SavingsAccount

Variables

  • Length: variables with a large scope should have long names, variables with a small scope can have short names
  • Temporary variables used for temporary storage (e.g. loop indices) are best kept short. A programmer reading such a variable should be able to assume that its value is not used outside of a few lines of code. Common scratch variables for integers are i, j, k, m, n. Optionally, you can use ii, jj, kk, mm, and nn, which are easier to highlight when looking for indexing bugs.
  • Case: camelCase, i.e. mixed case starting with a lowercase letter, as in functionDecl
    • Variables purposely start with a lowercase letter, whereas types start with an uppercase letter, so the first letter of a name shows whether it is a variable or a type.

Booleans

Negated boolean variable names must be avoided. The problem arises when such a name is used in conjunction with the logical negation operator as this results in a double negative. It is not immediately apparent what !isNotFound means.

bool isError; // NOT: isNoError
bool isFound; // NOT: isNotFound

Collections

Plural form should be used on names representing a collection of objects. This enhances readability since the name gives the user an immediate clue as to the type of the variable and the operations that can be performed on its elements.

For example,

std::vector<Point> points;
int values[10]; // an array declaration needs a size to compile

Constants

Named constants (including enumeration values): MUST be all uppercase using underscore to separate words.

For example:

const int MAX_ITERATIONS = 25;
enum Color { COLOR_RED };
const double PI = 3.14159265358979;

In general, the use of such constants should be minimized. In many cases implementing the value as a method is a better choice:

int getMaxIterations() // NOT: MAX_ITERATIONS = 25
{
    return 25;
}

Generic

Generic variables should have the same name as their type. This reduces complexity by reducing the number of terms and names used. Also makes it easy to deduce the type given a variable name only. If for some reason this convention doesn't seem to fit it is a strong indication that the type name is badly chosen.

void setTopic(Topic* topic) // NOT: void setTopic(Topic* value)
                            // NOT: void setTopic(Topic* aTopic)
                            // NOT: void setTopic(Topic* t) 

void connect(Database* database) // NOT: void connect(Database* db)
                                 // NOT: void connect (Database* oracleDB)

Non-generic variables have a role. These variables can often be named by combining role and type:

Point  startingPoint, centerPoint;
Name   loginName;

Globals

Must always be fully qualified, using the scope-resolution operator ::.

For example, ::mainWindow.open() and ::applicationContext.getName()

In general, the use of global variables should be avoided. Instead,

  • Place variable into a namespace
  • Use singleton objects
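
Both alternatives can be sketched as follows. This is only an illustration: ApplicationConfig, verboseLevel and ApplicationContext are hypothetical names chosen for the example, and the singleton uses a function-local static instance, which is one common way to write it.

```cpp
#include <string>

namespace ApplicationConfig
{
  // Namespace-scoped variable instead of a bare global.
  int verboseLevel = 0;
}

// Singleton: one shared instance, reachable only through instance().
class ApplicationContext
{
  public:
    static ApplicationContext& instance()
    {
      static ApplicationContext context; // created on first use
      return context;
    }

    std::string getName() const { return name_; }
    void setName(const std::string& name) { name_ = name; }

  private:
    ApplicationContext() : name_("unnamed") {}
    // Declared but not defined: copying the singleton is disallowed.
    ApplicationContext(const ApplicationContext&);
    ApplicationContext& operator=(const ApplicationContext&);

    std::string name_;
};
```

Callers then write ApplicationContext::instance().getName() or ApplicationConfig::verboseLevel, so the origin of each name is visible at the point of use.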

Private class variables

Private class variables should have an underscore suffix. Apart from its name and its type, the scope of a variable is its most important feature. Indicating class scope by using an underscore makes it easy to distinguish class variables from local scratch variables.

For example,

class SomeClass {
  private:
    int length_;
};

An issue is whether the underscore should be added as a prefix or as a suffix. Both practices are commonly used, but the latter is recommended because it seems to best preserve the readability of the name. A side effect of the underscore naming convention is that it nicely resolves the problem of finding reasonable variable names for setter methods and constructors:

  void setDepth(int depth)
  {
    depth_ = depth;
  }

Methods and Functions

Names representing methods or functions MUST be verbs and written in mixed case starting with lower case. Functions (methods that return a value) should be named after what they return, and procedures (void methods) after what they do.

  • e.g. getName(), computeTotalWidth(), isEmpty()

A method name should not repeat the name of its object.

  • e.g. line.getLength(); // NOT: line.getLineLength();

The latter seems natural in the class declaration, but proves superfluous in use, as shown in the example.

The terms get and set must be used where an attribute is accessed directly.

  • e.g: employee.getName(); employee.setName(name); matrix.getElement(2, 4); matrix.setElement(2, 4, value);

The term compute can be used in methods where something is computed.

  • e.g: valueSet->computeAverage(); matrix->computeInverse()

This gives the reader the immediate clue that the operation is potentially time-consuming and that, if it is used repeatedly, caching the result may be worthwhile. Consistent use of the term enhances readability.

The term find can be used in methods where something is looked up.

  • e.g.: vertex.findNearestVertex(); matrix.findMinElement();

This gives the reader the immediate clue that this is a simple look-up method with a minimum of computation involved. Consistent use of the term enhances readability.

The term initialize can be used where an object or a concept is established.

  • e.g: printer.initializeFontSet();

The American spelling initialize should be preferred over the British initialise. The abbreviation init should be avoided.

The prefix is should be used for boolean variables and methods.

  • e.g: isSet, isVisible, isFinished, isFound, isOpen

There are a few alternatives to the is prefix that fit better in some situations. These are the has, can and should prefixes:

  • bool hasLicense();
  • bool canEvaluate();
  • bool shouldSort();

Parameters should be separated by a single space character, with no leading or trailing spaces in the parameters list:

  • YES: void foo(int x, int y)
  • NO: void foo ( int x,int y )
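
To show how these prefixes read together, here is a small sketch of a class whose methods follow the get/set, is, compute and find conventions. ValueSet and its members are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <vector>

class ValueSet
{
  public:
    // get/set: direct access to an attribute.
    std::size_t getSize() const { return values_.size(); }
    void setValues(const std::vector<double>& values) { values_ = values; }

    // is: boolean query.
    bool isEmpty() const { return values_.empty(); }

    // compute: potentially expensive, walks every element.
    double computeAverage() const
    {
      if (values_.empty())
        return 0.0;
      double sum = 0.0;
      for (std::size_t i = 0; i < values_.size(); i++)
        sum += values_[i];
      return sum / static_cast<double>(values_.size());
    }

    // find: a look-up; returns the index of the first match,
    // or getSize() if the value is not present.
    std::size_t findValue(double value) const
    {
      for (std::size_t i = 0; i < values_.size(); i++)
      {
        if (values_[i] == value)
          return i;
      }
      return values_.size();
    }

  private:
    std::vector<double> values_;
};
```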

Directories

Naming Convention

List of common names

  • src: to put source files, headers
  • include: to put headers if you have many headers and don't want to put them all into ./src
  • tests: put test inputs
  • docs: detailed documentation not covered by README

Please use camelCase for your directory name.

  • Avoid a leading capital letter.

Examples of preferred names

  • roseExtensions
  • roseSupport
  • roseAPI

What to avoid

  • rose_api
  • rose_support

Layout

TODO: big picture about where to put things within the ROSE git repository.


For each project directory under ./projects, it is our convention to have subdirectories for different files

  • README: must have this
  • ./src: for all your source files
  • ./include: for all your headers if you don't want to put them all into ./src
  • ./tests: for your test input files
  • ./doc: for your more extensive documentation if README is not enough

Files

A single file should contain one logical unit, or feature. Keep it modular!

Naming Conventions

A file name should be specific and descriptive about what it contains.

You should use camelCase (lowercase character in the beginning)

  • good example: fileName.h

What should be avoided

  • names starting with a capital letter
  • names using underscores, such as file_name.h

Bad file names

  • functions.h
  • file_name.h

Line Length

  • File content should be kept within 80 columns.

80 columns is a common dimension for editors, terminal emulators, printers and debuggers, and files that are shared between several people should stay within these constraints. Avoiding unintentional line breaks improves readability when a file is passed between programmers. Code in a tutorial that is wider than 80 columns is also unlikely to fit on the page, which forces readers to go into the code base itself.

Indentation

Avoid tabs for your code indentation, except in cases where tabs (\t) are required, e.g. Makefiles.

2 or 4 spaces are recommended for code indentation.

for (i = 0; i < nElements; i++) 
  a[i] = 0;

Indentation of 1 is too small to emphasize the logical layout of the code. Indentation larger than 4 makes deeply nested code difficult to read and increases the chance that the lines must be split.

Characters

  • Special characters like TAB and page break must be avoided.

These characters are bound to cause problems for editors, printers, terminal emulators or debuggers when used in a multi-programmer, multi-platform environment.

We already have a built-in Perl script to enforce this policy.

Header Files

File name:

  • must be camelCase: such as fileName.h or fileName.hpp
  • avoid file_name.h

Suffix

  • For C header files: Use .h
  • For C++ header files: Use .h or .hpp

Must have

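  • Include guards: preprocessor directives that prevent the header from being included more than once. For example:
#ifndef HEADER_FILE_X_H
#define HEADER_FILE_X_H

// Declarations go here. The guard name has no leading underscore because
// names that begin with an underscore and an uppercase letter are reserved.

#endif // HEADER_FILE_X_H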
  • try to put your variables, functions, classes within a descriptive namespace.
  • Include statements must be located at the top of a file only.
    • Avoid unwanted compilation side effects by "hidden" include statements deep into a source file.

What to avoid in a header

  • global variables, functions, or classes: they will pollute the global scope
  • using namespace std;
    • this will pollute the global scope for each .cpp file which includes this header. using namespace should only be used by .cpp files. More explanations are at link and link2
  • function definitions
    • Headers are meant to expose types and function interfaces, and they will be included by multiple .cpp files. A non-inline function definition in a header causes a multiple-definition error when the object files of the .cpp files that include it are linked together.
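
A header that follows these rules might look like the sketch below, shown together with its source file in one listing. The file names loopHelper.h and loopHelper.cpp, the LoopHelper namespace and computeSum() are hypothetical. The guard name has no leading underscore, since names beginning with an underscore and an uppercase letter are reserved in C++.

```cpp
// ---- loopHelper.h (hypothetical header) ----
#ifndef LOOP_HELPER_H
#define LOOP_HELPER_H

#include <vector>

namespace LoopHelper
{
  //! Returns the sum of all elements in values.
  int computeSum(const std::vector<int>& values);
}

#endif // LOOP_HELPER_H

// ---- loopHelper.cpp (hypothetical source file) ----
// #include "loopHelper.h"  // needed once the two parts are separate files

namespace LoopHelper
{
  int computeSum(const std::vector<int>& values)
  {
    int sum = 0;
    for (std::vector<int>::size_type i = 0; i < values.size(); i++)
      sum += values[i];
    return sum;
  }
}
```

The header carries only the declaration inside a descriptive namespace; the definition lives in the source file, so including the header from several .cpp files does not duplicate it.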


Source Files

Again, file names should follow the naming convention:

  • camelCase file name: e.g. sageInterface.cpp
  • Avoid capitalization, spaces, special characters

Preferred suffix

  • Use .c for C source files
  • Use .cpp or .cxx for C++ source files

What to avoid

  • Capitalized .C for source files. This causes problems when porting ROSE to case-insensitive file systems.

README

All major directories within the ROSE git repository should have a README file.

  • projects/projectXYZ MUST have a README file.

File name should be README

What to avoid

  • README.txt
  • readme

Required Content

For all major directories in ROSE, there should be a README explaining

  • What is in this directory
  • What does this directory accomplish
  • Who added it and when

Each project directory must have a README to explain:

  • What this project is about
    • Name of the project
    • Motivation: Why do we have this project
    • Goal: What do we want to achieve
  • Design/Implementation: So next person can quickly catch up and contribute to this project
    • How do we design/implement it.
    • What is the major algorithm
  • Brief instructions about how to use the project
    • Installation
    • Testing
    • Or point out where to find the complete documentation
  • Status
    • What works
    • What doesn't work
  • Known limitations
  • References and citations: for the underlying algorithms
  • Authors and Dates

Format

Format of README

  • text format with clear sections and bullets
  • optionally, you can use styles defined by w:Markdown

Examples

An example README can be found at

Source Code Documentation

The source code of ROSE is documented using the Doxygen documentation system.

General Guidelines

  • English only
  • Use valid Doxygen syntax (see "Examples" below)
  • Make the code readable for a person who reads your code for the first time:
    • Document key concepts, algorithms, and functionalities
    • Cover your project, file, class/namespace, functions, and variables.
    • State your input and output clearly, specifically the meaning of the input or output
      • Users are more likely to use your code if they don't have to think about what the output means or what the input should be
    • Clever is often synonymous with obfuscated; avoid this form of cleverness in coding.

TODO, not ready yet

  • Test your documentation by generating it on your machine and then manually inspecting it to confirm its correctness

TODO: Generating Local Documentation

This sometimes does not work, because a configuration file determines which directories are scanned to generate the web reference HTML files.

  $ make doxygen_docs -C ${ROSE_BUILD}/docs/Rose/

Use //TODO

This is a recommended way to improve your code's comments.

During incremental development, you will often decide to defer something to a later iteration, or you will know that your current implementation has limitations to be fixed in the future.

A good practice is to immediately put a TODO source comment (// TODO ...) into the relevant code when you make such a decision, so you won't forget what you want to do next time.

The TODOs also serve as some handy flags within the code for other people if they want to improve your work after you are gone.

Examples

Single Line

Often a brief single line comment is enough

//! Brief description.

Multiple lines

Doxygen supports comments with multiple lines.

/**
 
   ... text..
 
 */

/**
 *
 *  ... text..
 *
 */


/*******************************//**
 *         text
 **********************************/

/////////////////////////////////////
///  ... text <= 80 columns in length
/////////////////////////////////////

Combined single line and multiple lines

Doxygen can generate a brief comment for a function and optionally show detailed comments if users click on the function.

Here are the options to support combined single-line and multiple-line source comments.

Option 1:

/**
 * \brief Brief description.
 *        Brief description continued.
 *
 * [Optional detailed description starts here.]
 */

Option 2:

/**
 \brief Brief description.
        Brief description continued.
 
 [Optional detailed description starts here.]
 */

---

Single-line comment followed by a multiple-line comment:

You may extend an existing single-line comment with a multiple-line comment (Option 1 or 2). For example:

//! Brief description.
/**
 * Detailed description starts here.
 */


TODO: provide a full, combined example.
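
One possible combined example is sketched below; it is an illustration, not an official ROSE template. It pairs a brief //! line with a detailed block that uses the Doxygen \param and \return commands. The function clampValue() is hypothetical.

```cpp
//! Clamps a value to a closed range.
/**
 * Returns lowerBound if value is below the range, upperBound if it is
 * above the range, and value itself otherwise.
 *
 * \param value       the number to clamp
 * \param lowerBound  smallest allowed result
 * \param upperBound  largest allowed result; assumed >= lowerBound
 * \return the clamped value
 */
int clampValue(int value, int lowerBound, int upperBound)
{
  if (value < lowerBound)
    return lowerBound;
  if (value > upperBound)
    return upperBound;
  return value;
}
```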

Functions

Rules

  • Except for simple functions like getXX() and setXX(), every function should have at least a one-line comment explaining what it does.
  • Avoid global functions and global variables. Try to put them into a namespace.
  • A function should not have more than 100 lines of code. Please refactor big functions into smaller, separate functions.
  • Limit unconditional printf() calls so your translator will not print hundreds of lines of unnecessary text when processing multiple input files.
    • Use an if condition to control printf() for debugging purposes, such as if ( SgProject::get_verbose() > 0 )
  • The beginning of a function should sanity-check the function parameters.
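
The sketch below shows a parameter sanity check at the top of a function and debug output guarded by a verbosity level. To keep it self-contained, a hypothetical file-local variable verboseLevel stands in for a setting such as SgProject::get_verbose(); computeNameLength() is likewise an invented example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for a verbosity setting such as
// SgProject::get_verbose(); not part of the ROSE API.
static int verboseLevel = 0;

//! Returns the number of characters in name; name must not be NULL.
std::size_t computeNameLength(const char* name)
{
  // Sanity-check the parameter before doing any work.
  if (name == NULL)
  {
    std::fprintf(stderr, "computeNameLength(): name must not be NULL\n");
    assert(name != NULL);
  }

  std::size_t length = 0;
  while (name[length] != '\0')
    length++;

  // Debug output is printed only when verbosity is raised.
  if (verboseLevel > 0)
    std::printf("computeNameLength(): length = %lu\n",
                static_cast<unsigned long>(length));

  return length;
}
```

With verboseLevel left at 0, the function prints nothing on the normal path, so processing many input files stays quiet.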

Comments

Rules

  • Please follow Doxygen style comments
  • Please explain in sufficient detail how your function works and the steps in the algorithm.
    • Reviewers will read your commented information to understand your algorithm and then read your code to see if the code implements the algorithm correctly and efficiently.

Coding

Correctly implement the designed/documented algorithms. Future users won't have time to read your code directly to discern what it does.

Code should be efficient in terms of both time and space (memory) complexity.

Please be aware that your translator may handle thousands of statements with even more AST nodes.

Be aware that people other than you may use your code or develop it further. Please make this as easy as possible.

Classes

Try to use namespace when possible, avoid global variables or classes.

Name Equals Functionality

Name the class after what it is. If you can't think of what it is, that is a clue that you have not thought through the design well enough.

  • A class name should be a noun.

Compound names of over three words are a clue your design may be confusing various entities in your system. Revisit your design. Try a CRC card session to see if your objects have more responsibilities than they should.

Explicit Access

All sections (public, protected, private) should be identified explicitly. Sections that do not apply should be left out.

Public Members First

The parts of a class should be sorted public, protected and private.

The ordering is "most public first" so people who only wish to use the class can stop reading when they reach the protected/private sections.

Class Variables

Class variables should NOT be declared public.

The concept of C++ information hiding and encapsulation is violated by public variables. Use private variables and access functions instead. One exception to this rule is when the class is essentially a data structure, with no behavior (equivalent to a C struct). In this case it is appropriate to make the class' instance variables public.
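
The layout rules for classes can be sketched in one short class: explicit sections ordered public, protected, private, and a private variable reached through access functions. The members shown are hypothetical and serve only as an illustration.

```cpp
class SavingsAccount
{
  public:
    SavingsAccount() : balance_(0.0) {}

    double getBalance() const { return balance_; }

    // Ignores non-positive amounts.
    void deposit(double amount)
    {
      if (amount > 0.0)
        balance_ += amount;
    }

  protected:
    // Hook for derived account types.
    void setBalance(double balance) { balance_ = balance; }

  private:
    double balance_; // never public; reached through the access functions
};
```

A reader who only wants to use the class can stop at the end of the public section.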

Avoid Structs

Structs are kept in C++ for compatibility with C only, and avoiding them increases the readability of the code by reducing the number of constructs used. Use a class instead.

Statements

Loops

Only loop control statements may be included in the for() construction, nothing else is allowed.

// Correct
sum = 0;
for (i = 0; i < 100; i++)
  sum += value[i];

// Incorrect
for (i = 0, sum = 0; i < 100; i++)
  sum += value[i];

This increases maintainability and readability. It also allows future developers to make a clear distinction of what controls and what is contained in the loop.

Loop variables should be initialized immediately before the loop.

Type Conversions

Type conversions must always be done explicitly. Never rely on implicit type conversion.

  //Correct
  floatValue = static_cast<float>(intValue); 
  //Incorrect 
  floatValue = intValue;

By this, the programmer indicates awareness of the different types involved and that the mix is intentional.

Conditionals

The body of a conditional must be put on a separate line.

 if (isDone)       // NOT: if (isDone) doCleanup();
   doCleanup();

This is for debugging purposes. When the body is written on the same line as the test, it is not apparent while stepping through the code whether the test was true.

There must be a space separating the keyword if from the condition statement (isDone).

if (isDone)
  ^ space

Complex conditional expressions must be avoided. You must introduce temporary boolean variables instead

// Recommended
bool isFinished = (elementNo < 0) || (elementNo > maxElement);
bool isRepeatedEntry = elementNo == lastElement;
if (isFinished || isRepeatedEntry) { ... }

// NOT: if ((elementNo < 0) || (elementNo > maxElement) || elementNo == lastElement) { ... }

By assigning boolean variables to expressions, the program gets automatic documentation. The construction will be easier to read, debug and maintain. When the variables are well named, it also helps future developers understand what each part of the construction is accomplishing.

printf and cout

All screen output MUST be put into an if statement so that it is executed conditionally, controlled either by the verbose level or by another debugging option.

Translators MUST NOT print out information by default.

TODO: this can be enforced by a simple Compass checker in the future.

switch

Carefully differentiate between

  • things that are known to be safe to ignore, and
  • things that are not yet handled by the current implementation.

  switch (type->variantT())
  {
    case V_SgTypeDouble:
      {
        ...
      }
      break;
    case V_SgTypeInt:
      {
        ...
      }
      break;
    case V_SgTypeFloat: // things which are known to be allowed to be ignored
      break;
    default:
      {
        // Things which are not yet explicitly handled
        cerr << "warning, unhandled node type: " << type->class_name() << endl;
      }
      break;
  }

assert

It is encouraged to use assert often to explicitly express and guarantee assumptions used in the code.

Please use ROSE_ASSERT() or assert().

For each assertion, you MUST add a printf or cerr message that says where in the code the failure occurred and what went wrong, so users can immediately know the cause of the assertion failure without going through a debugger.
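
A minimal sketch of an assertion paired with a diagnostic message follows. The function computeQuotient() is hypothetical, and the standard assert() is used so the example stands alone; in ROSE code ROSE_ASSERT() could take its place.

```cpp
#include <cassert>
#include <iostream>

//! Returns numerator / denominator; denominator must be non-zero.
int computeQuotient(int numerator, int denominator)
{
  if (denominator == 0)
  {
    // Say where and why before the assertion stops the program.
    std::cerr << "computeQuotient(): denominator is zero" << std::endl;
    assert(denominator != 0);
  }
  return numerator / denominator;
}
```

The message is printed only on the failing path, so it does not conflict with the rule against unconditional output.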

Statements To Be Avoided

The following statements should usually be avoided:

  • Goto statements should not be used. Goto statements violate the idea of structured code. There are very few cases (for instance breaking out of deeply nested structures) where goto should be considered, and only if the equivalent structured counterpart is less readable.
  • Executable statements in conditionals should be avoided. Conditionals with executable statements are very difficult to read.
  FILE* fileHandle = fopen(fileName, "w");
  if (!fileHandle) { ... }
  // NOT: if (!(fileHandle = fopen(fileName, "w"))) { ... }

Expressions

Guidelines for readability, simplicity and debuggability.

  • Ternary operators (?:) should be replaced with if/else.
  • Long expressions should be broken up into several simpler statements. Add assertion for each pointer value obtained along the process to assist later debugging.
  • Clever use of operator precedence, shortcut evaluation, assignment expressions, etc. should be rewritten to easy-to-understand alternative forms.
  • Always remember that future programmers will appreciate clear and simple code rather than obfuscated cleverness.
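
As an illustration of the first two guidelines, the sketch below breaks a chained pointer expression into steps with an assertion after each pointer is obtained, and uses if instead of a ternary operator. Scope, Statement and computeNestingFlag() are hypothetical; the diagnostic messages that should accompany each assertion are left out to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node types, used only for illustration.
class Scope
{
  public:
    explicit Scope(int depth) : depth_(depth) {}
    int getDepth() const { return depth_; }

  private:
    int depth_;
};

class Statement
{
  public:
    explicit Statement(Scope* scope) : scope_(scope) {}
    Scope* getScope() const { return scope_; }

  private:
    Scope* scope_;
};

// NOT: return statement->getScope()->getDepth() > 0 ? 1 : 0;
int computeNestingFlag(Statement* statement)
{
  assert(statement != NULL);
  Scope* scope = statement->getScope();
  assert(scope != NULL);

  int depth = scope->getDepth();
  int flag = 0;
  if (depth > 0)
    flag = 1;
  return flag;
}
```

If a pointer along the way is NULL, the failing assertion points at the exact step instead of at one long expression.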

AST Translators

All ROSE-based translators should call AstTests::runAllTests(project) after all transformations are done to make sure the translated AST is correct.

This is a higher standard than merely unparsing to compilable code. It is common for an AST to be unparsed correctly but to fail the sanity check.

More information is at Sanity_check

References

We list some external resources that influenced ROSE's coding standard