Scheme Programming/Macros
Macros are said to be what makes Lisp Lisp. Other programming languages, notably C and C++, have macros, but they're not like Lisp macros. In C, the macro preprocessor allows you to do limited textual substitution on snippets of C code. Sometimes, this produces working C code, while other times it can result in invisible misplaced braces or semicolons that the compiler can't quite pinpoint.
Lisp macros are full-blown Lisp procedures, with the full power of any other Lisp procedure. Instead of text, they get lists that represent the bits of code that you want to change. The return value of a macro is a list representing the new program. Scheme supports three types of macro. When it comes to understanding what's going on, the easiest of these three types of macro is Lisp's original type of macro, known as defmacro
in Common Lisp, EMACS Lisp, and SCM. Other Scheme implementations call this form define-macro
, while still other implementations, notably Racket, reject it as being too primitive.
To use macros in SCM, you must first evaluate this form:
(require 'macro)
Other Scheme implementations generally do not require this step.
A simple macro can illustrate what's going on. We want it to do something that would be impossible to do with a procedure. So we'll implement a macro that works like a `begin` form, but runs the code within it in reverse order:
(defmacro (backwards . body)
(cons 'begin
(reverse body)))
Every time you write this:
(backwards
(newline)
(display 3)
(newline)
(display 2)
(newline)
(display 1))
...the macro code above gets called at compile time, and the code inside the backwards
block is passed as the body
argument. The
backwards
macro reverses this code and conses the symbol begin
to the beginning of the result, creating a begin
form:
(begin
(display 1)
(newline)
(display 2)
(newline)
(display 3)
(newline))
Your program will execute as if that begin form was what you wrote in your program. Macros can do absolutely anything to the code in their arguments. Here is a more useful macro:
(defmacro (while condition . body)
`(let loop ()
(cond (,condition
(begin . ,body)
(loop)))))
Then you can write this program:
(define counter 10)
(while (> counter 0)
(display counter)
(newline)
(set! counter (- counter 1)))
The while loop checks if the conditional expression is true, and if it is, it executes the body over and over until the
condition becomes false. In the above code, the condition becomes false when counter
reaches a value of 0.
There is just one small problem with defmacro
. The loop
procedure that our while loop uses for recursion is visible to
the code in the body when you use a while loop. You could set it to something else, and then the loop would break. You might do this accidentally:
> (while (< counter 10)
(display counter)
(newline)
(set! loop 'oops))
0
;ERROR: Wrong type to apply: oops
; in expression: (loop)
; in scope:
; () procedure loop
; (loop . #@let)
;STACK TRACE
1; ((#@cond ((#@< #@counter 10) (#@display #@counter) (#@newline) ...
2; ((#@let ((loop (#@lambda () (#@cond ((#@< #@counter 10) (#@dis ...
Lisp provides a solution to this, called gensym
in Common Lisp and most Scheme implementations. However, in SCM, this procedure is called gentemp
. It generates a symbol that is guaranteed not to appear anywhere else in the program. You can use it to generate a name to use instead
of loop
. Fixing the while macro, it now looks like this:
(defmacro (while condition . body)
(let ((loop (gentemp))) ; gensym if you're not using SCM.
`(let ,loop ()
(cond (,condition
(begin . ,body)
(,loop))))))
Hygienic Macros: syntax-rules
editScheme introduces hygienic macros to the Lisp world. When writing this kind of macro, it is impossible to accidentally introduce variable names that the code you're altering can change. However, to write them, we must abandon the concept of code being made out of simple lists.
There are two kinds of hygienic macros, but only the less powerful of the two is supported by SCM. Syntax-rules macros rely on pattern-matching to define the valid ways to invoke the macro, and templates to define the output code. There are some macros that cannot be defined with syntax-rules
. Here's the while loop again, defined with syntax-rules
instead of defmacro
:
(define-syntax while
(syntax-rules ()
((_ condition . body)
(let loop ()
(cond (condition
(begin . body)
(loop)))))))
If you refer to loop
within the body if this version of while
, it refers to a different loop
from the one shown in the template. It is impossible to define a variable with syntax-rules
that code from outside the macro can see.
The form that says (_ condition . body)
is the pattern. _
is a wildcard. Anything can match it, and its value is not bound to anything. In this case, it is standing in the place where the word while
goes.
condition
and body
are pattern variables. They're not real variables. They won't have value at runtime, only within the template. Due to dotted-list notation, body
is matched in the cdr
position, which means it's all the forms that come after the body
.
Scheme's cond
form can be implemented with this macro system. In most implementations, cond
is in fact a macro. A typical cond
definition looks like this:
(define-syntax my-cond
(syntax-rules (else)
((_) (if #f 'not-reached))
((_ (else . body))
(begin . body))
((_ (bool-expression . body) . rest)
(if bool-expression
(begin . body)
(cond . rest)))))
my-cond
works like Racket's version of cond
in that (cond)
is legal and has an unspecified return value, and unlike SCM's cond
in which (cond)
is illegal. The (else)
at the top in parentheses is the list of literals. They have to match exactly in the code in
order to match the pattern. In other words, where you see else
in the pattern, it only matches if the word else
is found there.
Now let's make a REAL macro
editWhat if you could pattern-match regular lists at runtime in the same manner that you can pattern-match syntax at compile-time when using syntax-rules
? Then, if you had a list and you wanted to capture the first two elements (or do something else if it wasn't really a list or if it didn't have two elements), you could write something like this:
(match some-value (literally-hitler)
((literally-hitler . rest) ; First element is literally Hitler.
(error "Found the Nazi"))
(((a b) second . rest) ; First element is a two-element list.
(display a))
((first second . rest) ; It's a list with at least two elements.
(display (list first second)))
(else #f))
The macro supports literals just like syntax-rules
. The only thing that's not supported is the wildcard operator. Every value matched
will be bound to something.
We'll use syntax-rules
because in the end, the pattern matching provided by it does more to make this job easier than does
the Turing completeness of defmacro
. When we're done, we'll be able to use the same kind of pattern matching in defmacro
macros as well, by using this macro within its body. The only thing we'll be missing in defmacro
then will be hygiene.
How pattern matching works
editImplementing something like this using syntax-rules
will require two macros. One macro will match a single pattern, and evaluate
an expression if it succeeds, or return a failure value if it fails.
The match-pattern
macro requires five arguments:
(match-pattern pattern literals match-value fail-value success-expr)
The pattern
contains the names of variables in the positions you hope to find them in some candidate structure. So
if the pattern is (a . b)
, for example, then you're trying to bind a
to the car
of a list,
and b
to its cdr
. If a
and b
are not literals, then this pattern can be
matched against match-value
by generating the following code:
(let ((hd (maybe-car match-value fail-value)))
(if (eq? hd fail-value)
fail-value
(match-pattern tl literals (maybe-cdr match-value fail-value)
fail-value success-expr))))
maybe-car
is a special version of car
that won't raise an error if it's called on a non-pair. Instead of assuming that
match-value
will be a list, maybe-car
makes it possible to check, without us having to write (if (pair? match-value) ...)
every time.
If the car
of the list matches, the match-pattern
macro is used again on both the cdr
of the data and the cdr
of the pattern (which is tl
in the code snippet above). This way, tl
can be either another pattern, or a variable.
If maybe-car
or maybe-cdr
encounter something that isn't a list, they return fail-value
, which we then check for and return if it's found. That way, if any call to maybe-car
or maybe-cdr
evaluates to fail-value
, then the entire match-pattern
also evaluates to fail-value
.
The macro will use exists-in?
from the Looping chapter to implement
the literals. If an identifier in the pattern
is found in the literals
, then that identifier is interpreted as a symbol, and that position in the structure must be that identifier or fail-value
will be returned. The failure-value
will be provided from outside the macro. Ultimately, we'll generate a failure value at runtime
using (gentemp)
(or (gensym)
) on any Scheme implementation except SCM).
(require 'macro)
(define (maybe-car obj fail-value)
(if (pair? obj)
(car obj)
fail-value))
(define (maybe-cdr obj fail-value)
(if (pair? obj)
(cdr obj)
fail-value))
(define exists-in?
(lambda (ele lis)
(cond ((null? lis) #f)
((equal? ele (car lis)) #t)
(else (exists-in? ele (cdr lis))))))
(define-syntax match-pattern
(syntax-rules ()
;; No pattern. Matches if the match-value is null.
((_ () literals match-value fail-value success-expr)
(if (null? match-value)
success-expr
fail-value))
;; Notice there are TWO pattern-matches going on: One at compile-time via
;; syntax-rules, and another at runtime, being done with cond forms
;; and comparison with the 'fail-value to detect failures deeper in the
;; pattern.
;;
;; This case matches when the first element of the pattern is a list.
;; It generates code that matches the match-value only if its first element
;; is also a list.
((_ ((hhd . htl) . tl) literals match-value fail-value success-expr)
(cond ((eq? match-value fail-value)
fail-value)
;; Macros are allowed to expand into instances of themselves.
(else (match-pattern (hhd . htl) literals (maybe-car match-value fail-value)
fail-value
(match-pattern tl literals (maybe-cdr match-value fail-value) fail-value
success-expr)))))
;; Matches if the pattern itself is a list. hd, short for "head", is a
;; variable that will be bound to the first element of the match-value if it's
;; a list. If it's not a list, (maybe-car) will cause hd to be bound to the fail-value.
;;
;; Also, the match-value may already be the fail-value due to occurrences at a shallower
;; level in the pattern. If this happens, then this code won't bother to delve any deeper.
((_ (hd . tl) literals match-value fail-value success-expr)
(cond ((eq? match-value fail-value)
fail-value)
((exists-in? 'hd 'literals)
(if (eq? (maybe-car match-value fail-value) 'hd)
(match-pattern tl literals (maybe-cdr match-value fail-value)
fail-value success-expr)
fail-value))
(else
(let ((hd (maybe-car match-value fail-value)))
(if (eq? hd fail-value)
fail-value
(match-pattern tl literals (maybe-cdr match-value fail-value)
fail-value success-expr))))))
;; The pattern doesn't have to be a list. If it's not, it'll be bound to the
;; whole match-value. Control can also reach here if the non-list pattern
;; is in the cdr position of a larger pattern.
((_ non-list literals match-value fail-value success-expr)
(cond ((eq? match-value fail-value)
fail-value)
((exists-in? 'non-list 'literals)
(if (eq? 'non-list match-value)
success-expr
fail-value))
(else (let ((non-list match-value)) success-expr))))))
This example shows what it does:
> (define test '((a 2) (3 4)))
#<unspecified>
> (match-pattern ((a b) (c d)) (a) test 'fail (list b c d))
(2 3 4)
> (define test '((1 2) (3 4)))
#<unspecified>
> (match-pattern ((a b) (c d)) (a) test 'fail (list b c d))
fail
The second case fails because a
is defined as a literal, so there must be an actual a
symbol in there for the pattern
to match.
The failure-value makes it possible to evaluate a chain of these in a cond
form, which makes it possible to write a multi-pattern
match macro that mirrors syntax-rules
.
(define-syntax match
(syntax-rules (else)
((_ value literals) (error "No patterns matched"))
((_ value literals (else . body))
(begin . body))
((_ value literals (pattern . body) . rest)
(let* ((fail (gentemp))
(result (match-pattern pattern literals value fail
(begin . body))))
(if (eq? result fail)
(match value literals . rest)
result)))))
An example:
(match '((1 2) (3 4)) (a)
(((a b) (c d))
(list b c d))
(((b c) (d e))
(list (+ b c) (+ d e)))
(else 'whatever))
To demonstrate the use of this macro in conjunction with defmacro
, let's define my-cond
again using defmacro
:
(defmacro (my-cond . clauses)
(match (cons 'my-cond clauses) (else)
((x) '(if #f 'not-reached))
((x (else . body))
`(begin . ,body))
((x (bool-expression . body) . rest)
`(if ,bool-expression
(begin . ,body)
(my-cond . ,rest)))))
And that is what macros can do.