Rebol Programming/Advanced/Interpreter

Author: Ladislav Mecir

Interpretation phases

The interpretation of Rebol source code (text contained in a file or in a Rebol string, text typed in from the keyboard, etc.) consists of three basic steps:

  1. make phase
  2. load phase
  3. do phase

MAKE phase

In this phase the make function creates a Rebol block.

The make function fills the created block with the Rebol values obtained by parsing the supplied source text according to the rules for the corresponding Rebol datatypes.

All words (i.e. all values of the any-word! datatype) referenced by the new block are unbound, i.e. they have no context information. For more details on contexts, see the Bindology essay.
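As a small illustration of the MAKE phase, the console session below (a sketch assuming a Rebol 2 console, like the other transcripts in this article) builds a block from a string and inspects the datatypes of the values it contains. The source text is an arbitrary example.

```rebol
>> b: make block! {print "hi" 42}
== [print "hi" 42]
>> length? b
== 3
>> type? first b
== word!
>> type? second b
== string!
>> type? third b
== integer!
```

The word print in this block is still unbound at this point, so the block cannot be evaluated yet.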

LOAD phase

In this phase the load function extends the global context to contain all words (values of the any-word! datatype) referenced by the block made in the previous phase, with the exception of refinements.

All words (except for refinements) in the created blocks are replaced by their global context counterparts.

That is why all words contained in a load result block are global words (except for the refinements, as was stated above).
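The following session is a minimal sketch of this effect, assuming a Rebol 2 console. The word demo-word is a hypothetical name chosen only for the illustration. Because the word in the loaded block is a global word, setting it through the block is visible when the same word is typed at the console.

```rebol
>> b: load "demo-word 1"
== [demo-word 1]
>> set first b 10
== 10
>> demo-word
== 10
```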

DO phase

For the do function it is immaterial whether the interpreted block is a result of the previous phase or a result of any other operation. This model describes the behaviour of the do function processing Rebol blocks and parens in all situations.

The do function processes the values contained in the interpreted block one by one in a way we call implicit evaluation.

Simple implicit evaluation

For some values, the result of implicit evaluation is the encountered value itself. Examples are integers, decimals, characters, strings and blocks.
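A few such values can be tried directly, assuming a Rebol 2 console. Each block below contains a single value that evaluates to itself.

```rebol
>> do [42]
== 42
>> do [2.5]
== 2.5
>> do [#"c"]
== #"c"
>> do ["text"]
== "text"
>> do [[a b]]
== [a b]
```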

Implicit evaluation of parens

As opposed to blocks that implicitly evaluate to themselves, parens, when encountered in an interpreted block, are interpreted further in the same way as the do function interprets blocks.
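The difference can be seen by placing the same expression once in a nested block and once in a paren (a sketch assuming a Rebol 2 console):

```rebol
>> do [[1 + 2]]
== [1 + 2]
>> do [(1 + 2)]
== 3
```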

Implicit evaluation of functions

When any-function! datatype values are encountered, it is necessary to collect their arguments and supply them to the evaluated functions.

Specifically, if the do function is evaluated, it evaluates its argument in a way that we call explicit evaluation. In particular, if the argument is a block, the do function interprets it recursively as described in the "DO phase" section. Therefore, for a block, the explicit evaluation differs from the implicit evaluation. This is what makes Rebol special.
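The contrast between the implicit and the explicit evaluation of a block can be sketched as follows, assuming a Rebol 2 console. The word blk is a hypothetical name used only for the illustration.

```rebol
>> blk: [add 1 2]
== [add 1 2]
>> do [blk]
== [add 1 2]
>> do [do blk]
== 3
```

In the second expression the block is only encountered implicitly, so it is returned as it is. In the third expression it is supplied as an argument to do, so it is interpreted and add collects its two arguments.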

Implicit evaluation of words

When the implicitly evaluated value is a word having the word! datatype, it is treated as a variable and the value of the variable is looked up and treated depending on its datatype. We call this indirect implicit evaluation.

There is also an indirect explicit evaluation, which happens if a word or a path is supplied as an argument to the do function.
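A short sketch of indirect implicit evaluation, assuming a Rebol 2 console and a hypothetical variable x: the word x is looked up and yields the value it refers to, while the lit-word 'x yields the word itself.

```rebol
>> x: 5
== 5
>> do [x]
== 5
>> do ['x]
== x
```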

The result of the interpretation

When the do function finishes the implicit evaluation of all values contained in the block, the last obtained value becomes the result of the interpretation of the block.
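For instance, assuming a Rebol 2 console, only the last obtained value is returned:

```rebol
>> do [1 2 3]
== 3
>> do [1 + 1 "last"]
== "last"
```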

Exception: if the do function obtains an error! datatype value and the value isn't used as an argument to a function accepting error values, the interpreter causes the error.
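A minimal sketch of the non-error case, assuming a Rebol 2 console: here the error value produced by try is passed straight to error?, a function that accepts error values, so the interpreter does not cause the error.

```rebol
>> error? try [1 / 0]
== true
```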

In my opinion, this exception is unnecessary and can be painlessly avoided in future versions of the interpreter.

Simulation (simple Rebol interpreter written in Rebol)

The simulation function below is a simple interpreter able to interpret one line of text you input from the keyboard.

  simulation: func [/local input-line created-block loaded-block] [
      ; step #0, INPUT
      input-line: ask ">> "
      ; step #1, MAKE
      created-block: make block! input-line
      ; step #2, LOAD
      loaded-block: load created-block
      ; step #3, DO
      do loaded-block
      ; done
  ]

It illustrates the description written above.

Evaluation of Rebol words

Rebol words (the values of the word! datatype) exhibit the most complicated behaviour. When a word is evaluated by the do function, the do function first picks the value the word refers to. If the evaluated word has no context, the do function is unable to pick the value and it causes an error instead:

>> b: make block! "a"
== [a]
>> do b
** Script Error: a word has no context
** Near: a

The next action depends on the datatype of the picked value. A special case arises when the word is unset, i.e. when the value of the word is of the unset! datatype:

>> do [a]
** Script Error: a has no value
** Near: a

For some values no further action is needed and the picked value becomes the result of the word evaluation. For other values, an action is taken in accordance with the datatype. We could call values of the second kind word-active values, and their datatypes word-active datatypes. Recent versions of the interpreter use a decreasing number of word-active datatypes.

Typical representatives of the word-active values are any-function! values. When such a value is encountered as the value of an evaluated word, the do function collects all the arguments for the function and "calls the function" as described above.
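The following sketch, assuming a Rebol 2 console and a hypothetical function f, shows the word-active behaviour of a function value: evaluating the word f calls the function, while the get-word :f picks the value without calling it.

```rebol
>> f: func [x] [x + 1]
>> f 2
== 3
>> type? :f
== function!
```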

The shrinking list of the word-active datatypes still contains lit-words and lit-paths (unnecessarily, in my opinion). On the other hand, the behaviour of unset! values could be made more function-like too.
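The word-active behaviour of lit-words can be observed as follows, assuming a Rebol 2 console. The variable w is a hypothetical name used only here. Evaluating the word w does not simply return the lit-word it refers to, it returns the corresponding word, whereas the get-word :w returns the lit-word unchanged.

```rebol
>> w: first ['a]
== 'a
>> w
== a
>> :w
== 'a
```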

   word-active?: func [
       {finds out, if a Rebol value is word-active}
       value [any-type!]
   ] [
       parse head insert/only copy [] get/any 'value [
           unset! | any-function! | lit-word! | lit-path! 
       ]
   ]
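A few sample calls, assuming a Rebol 2 console in which the word-active? function above has already been defined:

```rebol
>> word-active? 1
== false
>> word-active? "text"
== false
>> word-active? :print
== true
>> word-active? first ['a]
== true
```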

Recursive behaviour

The word-active values can exhibit recursive behaviour like:

   ; recursive function
   factorial: func [n] [
       either n <= 1 [1] [n * factorial n - 1]
   ]
>> factorial 5
== 120

In fact, we can make even non-active values exhibit recursive behaviour, if we use active values appropriately:

   ; recursive block
   factorial-block: [
       either n <= 1 [1] [n * (n: n - 1 do factorial-block)]
   ]
>> n: 5 do factorial-block
== 120