Regular Expression - Fuzzy Regular Expressions

Variants of regular expressions can be used to work with natural-language text when possible typos and spelling variants must be taken into account. For example, the text "Julius Caesar" might be a fuzzy match for:

  • Gaius Julius Caesar
  • Yulius Cesar
  • G. Juliy Caezar

In such cases the mechanism implements a fuzzy string matching algorithm, and possibly an algorithm for measuring the similarity between a text fragment and the pattern.
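As a rough illustration of those two ingredients, the sketch below finds the approximate occurrence of a literal pattern inside a text with the fewest edits (a standard dynamic-programming edit-distance search, often attributed to Sellers) and then scores similarity with Python's standard `difflib`. The function names `best_fuzzy_match` and `similarity` are hypothetical and chosen only for this example. It handles plain strings rather than full regular expressions, and it is not meant to describe how any particular library works internally.

```python
from difflib import SequenceMatcher
from typing import Tuple


def best_fuzzy_match(pattern: str, text: str) -> Tuple[int, int]:
    """Return (end_index, edit_distance) for the approximate occurrence
    of `pattern` in `text` that needs the fewest edits.

    An edit is one inserted, deleted, or substituted character.
    `end_index` is the position in `text` just past the match.
    """
    m = len(pattern)
    # prev[i]: fewest edits to align pattern[:i] with some substring of
    # the text ending at the previous position. Before any text is read,
    # aligning pattern[:i] costs i deletions.
    prev = list(range(m + 1))
    best_end, best_dist = 0, prev[m]
    for j, ch in enumerate(text, start=1):
        # cur[0] stays 0 because a match may start at any text position.
        cur = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                cur[i - 1] + 1,      # pattern character missing from the text
            )
        if cur[m] < best_dist:
            best_end, best_dist = j, cur[m]
        prev = cur
    return best_end, best_dist


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity score in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    for candidate in ("Gaius Julius Caesar", "Yulius Cesar", "G. Juliy Caezar"):
        end, dist = best_fuzzy_match("Julius Caesar", candidate)
        score = similarity("Julius Caesar", candidate)
        print(candidate, end, dist, round(score, 2))
```

A caller would typically accept a candidate when the edit distance falls below a chosen threshold, or when the similarity score rises above one; the right threshold depends on the application.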

This task is closely related to both full text search and named entity recognition.

Several software libraries and tools support fuzzy regular expressions:

  • TRE - a well-developed, portable, free project written in C, with syntax similar to POSIX regular expressions.
  • FREJ - an open source Java project with non-standard syntax (prefix, Lisp-like notation), designed to make it easy to substitute inner matched fragments within outer blocks; it lacks many features of standard regular expressions.
  • agrep - a command-line utility (proprietary, but free for non-commercial use).
