Forth/Parsing in FORTH, or "What happened to the Compiler"

To understand the FORTH language we can start by examining what FORTH does when you type a command line to it. Here is a formal definition of the cycle of "interpretation" ("do this now") or "compilation ("do this later"). While it may be difficult to see the forest for the trees, let's briefly describe the whole process, and then go back through it and explain the significance in greater detail.

FORTH has simplified the process of human-to-machine commands by making it very uniform. To give your instructions to a FORTH system, a series of words is typed to the computer. FORTH then "reads" these words from first to last. To tell where one word ends and another begins, a simple rule is used: the words are separated by one or more ASCII blank characters (code 32). Each word is processed in order typed (or read in from a source file).

As each incoming word is broken out of the input stream, FORTH will determine whether it already has a definition for it in its "dictionary". FORTH refers to its symbol table, arranged as part of a "dictionary" structure. A match to the name of the word is searched for in the name field of each dictionary entry (starting with the most recently defined word and ending with the oldest word in the dictionary). If the word is not found in the dictionary, FORTH considers whether the input word could be interpreted as a number in the present base. If not, the input is incomprehensible to FORTH, and it rejects it with a message indicating no such word can be found in the dictionary.

If a definition was found or the input is a recognizable number, FORTH considers whether it is supposed to do this command now or later. If the answer is "now" ("interpretation"), it executes the definition of the word associated with its dictionary entry, or pushes the number atop FORTH's central switchyard for data, the stack. If the answer is "later" ("compilation") it usually means that we are in the process of adding a new word to the dictionary to become the most recently defined word.

Accordingly, we place a reference to the input word or number as the next entry in the definition of the word being defined. It will execute in its turn later, but only when someone requests the execution of the new word we are in the process of defining.

Then we repeat the above process to parse the next word in the input stream. This continues until there isn't any more input.

What are the implications of the above process? How does it compare with other approaches? (Note: the terms "word" and "subroutine" are used interchangeably in the following section to emphasize the parallel.)

  • Accessibility:

Any word in the dictionary can be called from the command line. So every word in a dictionary and the applications it contains can be individually tested without further ado. Just call the word in question from the command line. Find a bug? Unlike traditional toolchains, you are not required to recompile the application to include custom debug oriented calls to internal subroutines to get visibility. Just call the failing word from the command line. Fails? Look at its definition. Type each of these words in order to the command line. Does any one of these fail. Repeat. The problem can be isolated without a need to take special measures to make internal subroutines callable.

  • Vocabularies.

This does not mean that every internal word needs to be available to the customer at all times, including dangerous ones that e.g. erase their files. Groups of words can be organized in vocabulary lists, and particular vocabulary lists made available to particular command lines.

  • Natural discouragement of circular definitions and thinking: Words must be added to the dictionary in a sensible natural order. When building the definition of a word, if one calls for the use of a word that is not yet in the dictionary, one gets a "WORD NOT IN DICTIONARY" error. So one must organize a series of words starting with the foundation words that rely only on themselves, followed by a layer of words that rely on them, followed by the words that rely on them. This helps structure the dictionary in a logical manner.
  • Simplifying and combining the compile and load parts of the toolchain:

Traditional toolchains have distinct compile and load phases. Compilers place object code modules into libraries that are later loaded and linked, either in a static or dynamic manner. Only then are missing subroutine reports generated. FORTH simplifies and combines these steps. It is as if each library module is loaded and given a fixed address in memory. Once its address is established, the modules that call it can be loaded (later words in dictionary). These calling words then can point directly to the memory address of the entry of the word they wish to call, compared with a traditional linkers external reference process.

  • The distinction between source and object code evaporates.

It turns out this was unnecessary baggage of the other computer languages and may not be a useful concept after all. In this language, we see a small overhead to define a new word, (store its name, attach it to the rest of the dictionary, and other wrapper overhead). Compiling the "source code" definition of the word into the "object code" definition amazingly is no more than taking the name of each word in ASCII found in the source text and placing the corresponding memory address of its entry point into the executable "object code". This process is easily reversible. Compilation turns out to be removing the human readable comments found in the source code (machine doesn't need them in object code, it can't read them) and converting the human readable name of each word called into the corresponding numerical machine address that the hardware uses. The "compiler" as a mystified mountain of megabytes of exception case riddled code fighting your brain is no longer. Just a few well defined lookups. What was all that other compiler scholasticism actually useful for?

  • COMPILATION, in this system, means adding (a new word) to the top of the dictionary. To do this, each called wordname is translated from the ASCII name found in the source text to the corresponding memory addresses of its entry point. This number is added to the end of the "object code" definition of the word.