Hygienic Macro - The Hygiene Problem

In programming languages that have unhygienic macro systems, it is possible for existing variable bindings to be hidden from a macro by variable bindings that are created during its expansion. In C, this problem can be illustrated by the following fragment:

    #include <stdio.h>

    #define INCI(i) {int a=0; ++i;}

    int main(void)
    {
        int a = 0, b = 0;
        INCI(a);
        INCI(b);
        printf("a is now %d, b is now %d\n", a, b);
        return 0;
    }

Running the above through the C preprocessor expands the macro calls in main as follows:

    int main(void)
    {
        int a = 0, b = 0;
        {int a=0; ++a;};
        {int a=0; ++b;};
        printf("a is now %d, b is now %d\n", a, b);
        return 0;
    }

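In the first expansion, ++a refers to the a declared inside the macro's block, not to the a declared in main. The variable a in the outer scope is therefore never altered by the execution of the program, as the output of the compiled program shows: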

a is now 0, b is now 1
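The capture can be made visible with a small variation on the macro. The sketch below is illustrative only: INCI_TRACED, capture_result, and run_capture_demo are hypothetical names that do not appear in the original example. The macro additionally copies the value of whatever variable it incremented into a second argument, which shows that the increment did happen, but on the macro's own a.

```c
/* Variation on the unhygienic INCI macro (hypothetical, for illustration):
   after incrementing, it stores the value of the variable that ++i
   actually resolved to into `seen`. */
#define INCI_TRACED(i, seen) { int a = 0; ++i; (seen) = (i); }

struct capture_result {
    int outer_a;        /* value of main-style `a` after the macro call  */
    int seen_by_macro;  /* value of the `a` the macro really incremented */
    int outer_b;        /* value of `b` after the macro call             */
};

static struct capture_result run_capture_demo(void)
{
    int a = 0, b = 0;
    int seen_a = -1, seen_b = -1;

    /* Expands to { int a = 0; ++a; (seen_a) = (a); }
       so ++a binds to the block-local a, not the outer one. */
    INCI_TRACED(a, seen_a);

    /* Expands to { int a = 0; ++b; (seen_b) = (b); }
       so b has no name clash and is incremented as intended. */
    INCI_TRACED(b, seen_b);
    (void)seen_b;

    struct capture_result r = { a, seen_a, b };
    return r;
}
```

Calling run_capture_demo() yields outer_a == 0, seen_by_macro == 1, and outer_b == 1: the macro incremented its own temporary and left the caller's a untouched.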

Note that some C compilers, such as gcc, have an option like -Wshadow that warns when a local variable shadows another variable, which would have caught the above problem.

The simplest, and least robust, solution is to give the macro's variables names that are unlikely to clash with any name in the calling code:

    #include <stdio.h>

    #define INCI(i) {int INCIa=0; ++i;}

    int main(void)
    {
        int a = 0, b = 0;
        INCI(a);
        INCI(b);
        printf("a is now %d, b is now %d\n", a, b);
        return 0;
    }

As long as no variable named INCIa is passed to the macro, this solution produces the correct output:

a is now 1, b is now 1
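The fragility of the renaming approach can be sketched as follows. The function names collide_with_prefixed_name, inci, and increment_via_function are hypothetical and chosen only for this illustration. The first function shows the renamed macro failing again once the caller's variable happens to be called INCIa. The second part shows one possible alternative in C, assuming the macro's job can be done by an ordinary function taking a pointer: a function's local variables live in their own scope, so they cannot capture the caller's names.

```c
/* The renamed macro from the text: its temporary is now called INCIa. */
#define INCI(i) {int INCIa=0; ++i;}

/* Hypothetical caller whose own variable is also named INCIa. */
static int collide_with_prefixed_name(void)
{
    int INCIa = 0;
    INCI(INCIa);    /* expands to {int INCIa=0; ++INCIa;}; */
    return INCIa;   /* the outer INCIa was never incremented */
}

/* Alternative sketch: an ordinary function instead of a macro.
   Its local `a` is invisible to the caller, so no capture can occur. */
static void inci(int *i)
{
    int a = 0;      /* temporary with the same name as the caller's variable */
    (void)a;        /* unused; kept only to mirror the macro's body */
    ++*i;
}

static int increment_via_function(void)
{
    int a = 0;
    inci(&a);       /* increments the caller's a through the pointer */
    return a;
}
```

Here collide_with_prefixed_name() returns 0, reproducing the original bug under a different name, while increment_via_function() returns 1 even though both the caller and the function use a variable named a.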

The "hygiene problem" can extend beyond variable bindings. Consider this Common Lisp macro:

    (defmacro my-unless (condition &body body)
      `(if (not ,condition)
           (progn ,@body)))

While there are no references to variables in this macro, it assumes that the symbols "if", "not", and "progn" all have their standard meanings ("if" and "progn" as special operators, "not" as a function). Suppose, however, that the macro is used in the following code:

    (flet ((not (x) x))
      (my-unless t
        (format t "This should not be printed!")))

Because the definition of "not" has been locally altered, the behavior is undefined. Redefining standard functions and operators, globally or locally, invokes undefined behavior according to ANSI Common Lisp. Such usage can be diagnosed by the implementation as erroneous.

On the other hand, hygienic macro systems preserve the lexical scoping of all identifiers (such as "if" and "not") automatically. This property is called referential transparency.

Of course, the same problem can occur with program-defined functions, which are not protected in the same way:

    (defmacro my-unless (condition &body body)
      `(if (user-defined-operator ,condition)
           (progn ,@body)))

    (flet ((user-defined-operator (x) x))
      (my-unless t
        (format t "This should not be printed!")))

The Common Lisp solution to this problem is to use packages. The my-unless macro can reside in its own package, where user-defined-operator is a private symbol in that package. The symbol user-defined-operator occurring in the user code will then be a different symbol, unrelated to the one used in the macro.

Meanwhile, languages such as Scheme that use hygienic macros prevent accidental capture and ensure referential transparency automatically as part of the macro expansion process. In cases where capture is actually desired, some systems allow the programmer to explicitly violate the hygiene mechanisms of the macro system.

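For example, the following Scheme implementation of my-unless (written in the Racket dialect, which provides void and displayln) has the desired behavior even when "not" is rebound at the point of use:

    (define-syntax my-unless
      (syntax-rules ()
        [(_ condition body)
         (if (not condition)
             body
             (void))]))

    (let ([not (lambda (x) x)])
      (my-unless #t
        (displayln "This should not be printed!")))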
