Programming Fundamentals/Identifier Names

Overview

edit

Within programming, a variety of items are given descriptive names to make the code more meaningful to the programmer. These are called "Identifier Names." When an item is declared or defined, it is identified by a name. Some examples of items that can be named are constants, variables, type definitions, and functions. These names help identify the function of the item. These names follow a set of rules that are imposed by:

  1. the language's technical limitations
  2. good programming practices
  3. common industry standards for the language

Discussion

edit

Language Technical Limitations

edit
  • Must use only allowable characters: in many languages, the first character must be alphabetic or an underscore. The next character can be alphanumeric or an underscore
  • Can't use reserved words
  • Length limit

These attributes vary from one programming language to another. The allowable characters and reserved words will be different. The length limit refers to how many characters are allowed in an identifier name and is often compiler dependent and may vary from compiler to compiler for the same language. However, all programming languages have some form of the technical rules listed here.

Good Programming Techniques

edit
  • Meaningful
  • Be case consistent

Meaningful identifier names make your code easier to understand for other programmers. After all, what does “p” mean? Obviously, "p" could stand for anything. Because of this, avoid abbreviations and don't use cryptic, or hard to understand identifier names.

Some programming languages treat upper and lower case letters used in identifier names as the same (ex. pig and Pig are treated as the same identifier name). The compiler usually changes all identifier names to upper case (ex. pig and Pig are now both changed to PIG), unbeknownst to the programmer. However, not all programming languages act this way. Some will treat upper and lower case letters as being different things (ex. pig and Pig are two different identifier names). If you declare it as pig and then reference it in your code later as Pig, you will get a different variable or perhaps a compiler error. To avoid this problem altogether, we teach students to be case consistent. Use an identifier name only one way and spell it (upper and lower case) the same way every time within your program.

Industry Rules

edit

Almost all programming languages and most coding shops have a standard code formatting style guide programmers are expected to follow. Among these are three common identifier casing standards:

  • camelCase – each word is capitalized except the first word, with no intervening spaces
  • PascalCase – each word is capitalized including the first word, with no intervening spaces
  • snake_case – each word is lowercase with underscores separating words

C++, Java, and JavaScript typically use camelCase, with PascalCase reserved for libraries and classes. C# uses primarily PascalCase with camelCase parameters. Python uses snake_case for most identifiers. In addition, the following rules apply:

  • Do not start with an underscore (used for technical programming)
  • CONSTANTS IN ALL UPPER CASE (often UPPER_SNAKE_CASE)

These rules are decided on by the industry (those who are using the programming language).

Key Terms

edit
Camel case
The practice of writing compound words or phrases such that each word or abbreviation in the middle of the phrase begins with a capital letter, with no intervening spaces or punctuation.
Pascal case
The practice of writing compound words or phrases such that each word or abbreviation in the phrase begins with a capital letter, including the first letter, with no intervening spaces or punctuation.
Reserved word
Words that cannot be used by the programmer as identifier names because they already have a specific meaning within the programming language. (ie. if, then, else, while, for and case)
Snake case
The practice of writing compound words or phrases in which the elements are separated with one underscore character (_) and no spaces, with each element’s initial letter usually lowercased within the compound and the first letter either upper or lower case.

References

edit