History of Compiler Construction - Parser Generators

Parser Generators

For a more complete list, which also includes LL, SLR, GLR and LR parser generators, see Comparison of parser generators

A parser generator generates the lexical-analyser portion of a compiler. It is a program that takes a description of a formal grammar of a specific programming language and produces a parser for that language. That parser can be used in a compiler for that specific language. The parser detects and identifies the reserved words and symbols of the specific language from a stream of text and returns these as tokens to the code which implements the syntactic validation and translation into object code. This second part of the compiler can also be created by a compiler-compiler using a formal rules-of-precedence syntax-description as input.

The first compiler-compiler to use that name was written by Tony Brooker in 1960 and was used to create compilers for the Atlas computer at the University of Manchester, including the Atlas Autocode compiler. However it was rather different from modern compiler-compilers, and today would probably be described as being somewhere between a highly customisable generic compiler and an extensible-syntax language. The name 'compiler-compiler' was far more appropriate for Brooker's system than it is for most modern compiler-compilers, which are more accurately described as parser generators. It is almost certain that the "Compiler Compiler" name has entered common use due to Yacc rather than Brooker's work being remembered.

In the early 1960s, Robert McClure at Texas Instruments invented a compiler-compiler called TMG, the name taken from "transmogrification". In the following years TMG was ported to several UNIVAC and IBM mainframe computers.

Another early example of a parser generator is META II, first released in 1962 by Val Schorre. META II accepted grammars and code generation rules, and was able to compile itself and other languages. META II was the first documented version of a metacompiler. It also translated to one of the earliest instances of a virtual machine.

The Multics project, a joint venture between MIT and Bell Labs, was one of the first to develop an operating system in a high level language. PL/I was chosen as the language, but an external supplier could not supply a working compiler. The Multics team developed their own subset dialect of PL/I known as Early PL/I (EPL) as their implementation language in 1964. TMG was ported to GE-600 series and used to develop EPL by Douglas McIlroy, Robert Morris, and others.

Not long after Ken Thompson wrote the first version of Unix for the PDP-7 in 1969, Doug McIlroy created the new system's first higher-level language: an implementation of McClure's TMG. TMG was also the compiler definition tool used by Ken Thompson to write the compiler for the B language on his PDP-7 in 1970. B was the immediate ancestor of C.

An early LALR parser generator was called "TWS", created by Frank DeRemer and Tom Pennello.

Read more about this topic:  History Of Compiler Construction