Naming identifiers

edit

C++'s restriction about the names of identifiers and its keywords have already been covered, on the Code Section. They leave a lot of freedom in naming, one could use specific prefixes or suffixes, start names with an initial upper or lower case letter, keep all the letters in a single case or, with compound words, use a word separator character like "_" or flip the case of the first letter of each component word.

Note:
It is also important to remember to avoid collisions with the OS's APIs (depending on the portability requirements) or other standards. For instance POSIX's keywords terminate in "_t".

Hungarian notation
edit

Hungarian notation, now also referred to as Apps Hungarian, was invented by Charles Simonyi (a programmer who worked at Xerox PARC circa 1972-1981, and who later became Chief Architect at Microsoft); and has been until recently the preeminent naming convention used in most Microsoft code. It uses prefixes (like "m_" to indicate member variables and "p" to indicate pointers), while the rest of the identifier is normally written out using some form of mixed capitals. We mention this convention because you will very probably find it in use, even more probable if you do any programming in Windows, if you are interested on learning more you can check Wikipedia's entry on this notation.

This notation is considered outdated, since it is highly prone to errors and requires some effort to maintain without any real benefit in today's IDEs. Today refactoring is an everyday task, the IDEs have evolved to provide help with identifier pop-ups and the use of color schemes. All these informational aids reduce the need for this notation.

Leading underscores
edit

In most contexts, leading underscores are better avoided. They are reserved for the compiler or internal variables of a library, and can make your code less portable and more difficult to maintain. Those variables can also be stripped from a library (i.e. the variable is not accessible anymore, it is hidden from external world) so unless you want to override an internal variable of a library, do not do it.

Reusing existing names
edit

Do not use the names of standard library functions and objects for your identifiers as these names are considered reserved words and programs may become difficult to understand when used in unexpected ways.

Sensible names
edit

Always use good, unabbreviated, correctly-spelled meaningful names.

Prefer the English language (since C++ and most libraries already use English) and avoid short cryptic names. This will make it easier to read and to type a name without having to look it up.

Note:
It is acceptable to ignore this rule for loop variables and variables used within a small scope (~20 lines), they may be given short names to save space if the purpose of that variable is obvious enough. Historically the most commonly used variable name in this cases is "i".

The "i" may derive from the word "increment" or "index". The "i" is very commonly found in for loops that does fit nicely the specification for the use of such variable names.

In early Fortran compilers (upto F77), variables beginning with the letters i through n implicitly represented integers - and by convention the first few (i, j, k) were often used as loop counters.

Names indicate purpose
edit

An identifier should indicate the function of the variable/function/etc. that it represents, e.g. foobar is probably not a good name for a variable storing the age of a person.

Identifier names should also be descriptive. n might not be a good name for a global variable representing the number of employees. However, a good medium between long names and lots of typing has to be found. Therefore, this rule can be relaxed for variables that are used in a small scope or context. Many programmers prefer short variables (such as i) as loop iterators.

Capitalization
edit

Conventionally, variable names start with a lower case character. In identifiers which contain more than one natural language words, either underscores or capitalization is used to delimit the words, e.g. num_chars (K&R style) or numChars (Java style). It is recommended that you pick one notation and do not mix them within one project.

Constants
edit

When naming #defines, constant variables, enum constants. and macros put in all uppercase using '_' separators; this makes it very clear that the value is not alterable and in the case of macros, makes it clear that you are using a construct that requires care.

Note:
There is a large school of thought that names LIKE_THIS should be used only for macros, so that the name space used for macros (which do not respect C++ scopes) does not overlap with the name space used for other identifiers. As is usual in C++ naming conventions, there is not a single universally agreed standard. The most important thing is usually to be consistent.

Functions and member functions
edit

The name given to functions and member functions should be descriptive and make it clear what it does. Since usually functions and member functions perform actions, the best name choices typically contain a mix of verbs and nouns in them such as CheckForErrors() instead of ErrorCheck() and dump_data_to_file() instead of data_file(). Clear and descriptive names for functions and member functions can sometimes make guessing correctly what functions and member functions do easier, aiding in making code more self documenting. By following this and other naming conventions programs can be read more naturally.

People seem to have very different intuitions when using names containing abbreviations. It is best to settle on one strategy so the names are absolutely predictable. Take for example NetworkABCKey. Notice how the C from ABC and K from key are confused. Some people do not mind this and others just hate it so you'll find different policies in different code so you never know what to call something.

Prefixes and suffixes are sometimes useful:

  • Min - to mean the minimum value something can have.
  • Max - to mean the maximum value something can have.
  • Cnt - the current count of something.
  • Count - the current count of something.
  • Num - the current number of something.
  • Key - key value.
  • Hash - hash value.
  • Size - the current size of something.
  • Len - the current length of something.
  • Pos - the current position of something.
  • Limit - the current limit of something.
  • Is - asking if something is true.
  • Not - asking if something is not true.
  • Has - asking if something has a specific value, attribute or property.
  • Can - asking if something can be done.
  • Get - get a value.
  • Set - set a value.
Examples
edit

In most contexts, leading underscores are also better avoided. For example, these are valid identifiers:

  • i loop value
  • numberOfCharacters number of characters
  • number_of_chars number of characters
  • num_chars number of characters
  • get_number_of_characters() get the number of characters
  • get_number_of_chars() get the number of characters
  • is_character_limit() is this the character limit?
  • is_char_limit() is this the character limit?
  • character_max() maximum number of a character
  • charMax() maximum number of a character
  • CharMin() minimum number of a character

These are also valid identifiers but can you tell what they mean?:

  • num1
  • do_this()
  • g()
  • hxq

The following are valid identifiers but better avoided:

  • _num as it could be used by the compiler/system headers
  • num__chars as it could be used by the compiler/system headers
  • main as there is potential for confusion
  • cout as there is potential for confusion

The following are not valid identifiers:

  • if as it is a keyword
  • 4nums as it starts with a digit
  • number of characters as spaces are not allowed within an identifier