Parsing Expression Grammar

Advantages

Compared to pure regular expressions (i.e. without back-references), PEGs are strictly more powerful, but parsing them can require significantly more memory. For example, a regular expression inherently cannot match an arbitrary number of nested pairs of parentheses, because regular expressions have no recursion, but a PEG can. However, a PEG parser may need an amount of memory proportional to the length of the input (for its recursion stack or memoization table), while a regular expression matcher compiled to a finite automaton needs only a constant amount of memory.
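The parenthesis example can be made concrete with a minimal sketch of a recognizer for the PEG rule S <- '(' S ')' S / empty. The function names match_parens and is_balanced are hypothetical, chosen only for illustration, and the recursion is a direct hand translation of the rule rather than the output of any particular PEG tool.

```python
# Illustrative sketch only: match_parens and is_balanced are hypothetical names.
# Grammar: S <- '(' S ')' S / empty

def match_parens(text: str, pos: int = 0) -> int:
    """Return the position reached after matching S starting at pos."""
    # First alternative: '(' S ')' S
    if pos < len(text) and text[pos] == "(":
        inner = match_parens(text, pos + 1)
        if inner < len(text) and text[inner] == ")":
            return match_parens(text, inner + 1)
    # Second alternative: the empty string, which always succeeds
    return pos


def is_balanced(text: str) -> bool:
    """True if the whole input is a sequence of properly nested parentheses."""
    return match_parens(text) == len(text)


print(is_balanced("(()())"))
print(is_balanced("(()"))
```

The recursion depth grows with the input, which illustrates the memory cost mentioned above: a finite automaton has no comparable stack.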

Any PEG can be parsed in linear time by using a packrat parser, which memoizes the result of each rule at each input position, as described above.
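As a rough sketch of how memoization bounds the work, the following recognizer caches the result of every (rule, position) pair, so each rule body runs at most once per input position. The grammar, the class name PackratParser, and the helper names are assumptions made for this illustration and do not come from any specific library.

```python
# Illustrative sketch only: all names here are hypothetical.
# Grammar:
#   Sum     <- Product '+' Sum / Product
#   Product <- Atom '*' Product / Atom
#   Atom    <- '(' Sum ')' / [0-9]

class PackratParser:
    def __init__(self, text):
        self.text = text
        self.memo = {}   # (rule name, position) -> end position, or None on failure
        self.calls = 0   # how many rule bodies were actually evaluated

    def apply(self, rule, pos):
        """Evaluate rule at pos, reusing a cached result when one exists."""
        key = (rule.__name__, pos)
        if key not in self.memo:
            self.calls += 1
            self.memo[key] = rule(self, pos)
        return self.memo[key]

    def literal(self, ch, pos):
        """Match a single character; return the new position or None."""
        if pos < len(self.text) and self.text[pos] == ch:
            return pos + 1
        return None


def sum_(p, pos):
    end = p.apply(product, pos)
    if end is not None:
        after_plus = p.literal("+", end)
        if after_plus is not None:
            rest = p.apply(sum_, after_plus)
            if rest is not None:
                return rest
    return p.apply(product, pos)  # second alternative: served from the memo table


def product(p, pos):
    end = p.apply(atom, pos)
    if end is not None:
        after_star = p.literal("*", end)
        if after_star is not None:
            rest = p.apply(product, after_star)
            if rest is not None:
                return rest
    return p.apply(atom, pos)  # second alternative: served from the memo table


def atom(p, pos):
    after_open = p.literal("(", pos)
    if after_open is not None:
        inner = p.apply(sum_, after_open)
        if inner is not None:
            after_close = p.literal(")", inner)
            if after_close is not None:
                return after_close
    if pos < len(p.text) and p.text[pos] in "0123456789":
        return pos + 1
    return None


def recognize(text):
    """Return (matched_whole_input, number_of_rule_evaluations)."""
    parser = PackratParser(text)
    end = parser.apply(sum_, 0)
    return end == len(text), parser.calls
```

With three rules, the number of rule evaluations can never exceed three times the number of input positions, which is where the linear bound comes from; the price is a memo table that grows with the input.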

Parsers for languages expressed as a CFG, such as LR parsers, typically require a separate tokenization step to be done first, which breaks up the input into tokens based on the location of spaces, punctuation, etc. The tokenization is necessary because of the way these parsers use a limited amount of lookahead to parse CFGs that meet certain requirements in linear time. PEGs do not require tokenization to be a separate step, and tokenization rules can be written in the same way as any other grammar rule.
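To illustrate tokenization rules living alongside ordinary grammar rules, here is a small sketch in which Identifier, Number, and Spacing are written as parsing functions exactly like the Assignment rule that uses them. The grammar and every function name are hypothetical and serve only as an example.

```python
# Illustrative sketch only: the grammar and all names are hypothetical.
# Grammar:
#   Assignment <- Identifier Spacing '=' Spacing Number
#   Identifier <- [a-z]+
#   Number     <- [0-9]+
#   Spacing    <- [ \t]*

def many(pred, text, pos, minimum):
    """Greedily consume characters satisfying pred; fail below the minimum count."""
    end = pos
    while end < len(text) and pred(text[end]):
        end += 1
    return end if end - pos >= minimum else None


def identifier(text, pos):
    return many(lambda ch: "a" <= ch <= "z", text, pos, 1)


def number(text, pos):
    return many(lambda ch: "0" <= ch <= "9", text, pos, 1)


def spacing(text, pos):
    return many(lambda ch: ch in " \t", text, pos, 0)  # zero or more, never fails


def assignment(text, pos=0):
    """Return (name, value) if an assignment starts at pos, else None."""
    name_end = identifier(text, pos)
    if name_end is None:
        return None
    eq = spacing(text, name_end)
    if eq >= len(text) or text[eq] != "=":
        return None
    value_start = spacing(text, eq + 1)
    value_end = number(text, value_start)
    if value_end is None:
        return None
    return text[pos:name_end], int(text[value_start:value_end])


print(assignment("count   =  42"))
```

No token stream is built beforehand: the lexical rules are invoked on the raw characters at the point where the grammar needs them.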

Many CFGs contain ambiguities, even when they're intended to describe unambiguous languages. The "dangling else" problem in C, C++, and Java is one example. These problems are often resolved by applying a rule outside of the grammar. In a PEG, these ambiguities never arise, because the choice operator is prioritized: alternatives are tried in order, and the first one that matches is taken.
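The effect of prioritized choice on the dangling else can be sketched with a deliberately tiny grammar in which 'i' stands for an if-with-condition, 'e' for else, and 's' for a simple statement. The single-letter encoding and the function name stmt are assumptions made for this illustration only.

```python
# Illustrative sketch only: the one-letter token encoding and stmt are hypothetical.
# Grammar: Stmt <- 'i' Stmt 'e' Stmt / 'i' Stmt / 's'

def stmt(text, pos):
    """Return (tree, end position) for a statement at pos, or None on failure."""
    if text.startswith("i", pos):
        # Both 'i' alternatives begin with 'i' Stmt, so that prefix is parsed once.
        then = stmt(text, pos + 1)
        if then is None:
            return None
        then_tree, end = then
        # Alternative 1 (higher priority): the form with an else branch.
        if text.startswith("e", end):
            other = stmt(text, end + 1)
            if other is not None:
                else_tree, end_after_else = other
                return ("if", then_tree, else_tree), end_after_else
        # Alternative 2: the form without an else branch.
        return ("if", then_tree), end
    if text.startswith("s", pos):
        return "s", pos + 1
    return None


# "iises" reads as: if (if s else s)
print(stmt("iises", 0))
```

Because the inner statement tries the else-carrying alternative first, the else attaches to the nearest if, and the grammar yields exactly one parse without any rule applied from outside.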
