Spell Checker - Design

A spell checker customarily consists of two parts:

  1. A set of routines for scanning text and extracting words, and
  2. An algorithm for comparing the extracted words against a known list of correctly spelled words (i.e., the dictionary).
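
The two parts above can be sketched in a few lines of Python. This is only a minimal illustration, not a description of any particular spell checker: the names DICTIONARY, extract_words and find_misspellings are hypothetical, and the tiny word set stands in for a real dictionary.

```python
import re

# Hypothetical, deliberately tiny word list standing in for a real dictionary.
DICTIONARY = {"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"}

# Simplifying assumption: a word is an unbroken run of ASCII letters.
WORD_RE = re.compile(r"[A-Za-z]+")


def extract_words(text):
    """Part 1: scan the text and return the words found, in order."""
    return WORD_RE.findall(text)


def find_misspellings(text, dictionary=DICTIONARY):
    """Part 2: report every extracted word that is not in the dictionary."""
    return [word for word in extract_words(text) if word.lower() not in dictionary]


if __name__ == "__main__":
    print(find_misspellings("The quikc brown fox jumps over the lazy dog"))
```

Storing the word list as a set keeps each lookup cheap; a real checker would also need the morphological handling and user interaction described in the rest of this section.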

The scanning routines sometimes include language-dependent algorithms for handling morphology. Even for a lightly inflected language like English, word extraction routines will need to handle such phenomena as contractions and possessives. It is unclear whether morphological analysis provides a significant benefit for English, though its benefits for highly synthetic languages such as German, Hungarian or Turkish are clear.
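
As a rough sketch of what handling contractions and possessives might involve for English, the following Python keeps an apostrophe inside a token and strips a possessive "'s" before lookup. The names TOKEN_RE, CONTRACTIONS and lookup_forms are hypothetical, the contraction list is only a sample, and typographic apostrophes and plural possessives are ignored for brevity.

```python
import re

# Letters, optionally followed by apostrophe-plus-letters groups,
# so that "don't" and "dog's" each survive as a single token.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")

# Hypothetical sample of contractions that the word list stores whole.
CONTRACTIONS = {"don't", "isn't", "it's", "won't"}


def lookup_forms(text):
    """Return the lowercase form of each token that would be looked up."""
    forms = []
    for token in TOKEN_RE.findall(text):
        word = token.lower()
        # A trailing 's that is not part of a known contraction is treated
        # as a possessive marker and removed before the dictionary lookup.
        if word not in CONTRACTIONS and word.endswith("'s"):
            word = word[:-2]
        forms.append(word)
    return forms


if __name__ == "__main__":
    print(lookup_forms("The dog's bone isn't here"))
```

Highly synthetic languages would need far more than this kind of suffix stripping, which is where full morphological analysis becomes worthwhile.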

The word list might contain just a list of words, or it might also contain additional information, such as hyphenation points or lexical and grammatical attributes.
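
One way such an enriched word list could be represented is shown below. This is an assumed layout for illustration only: the Entry class, its fields, and the sample hyphenation offsets are hypothetical rather than taken from any real dictionary format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One word-list record (hypothetical layout)."""
    word: str
    hyphenation: tuple = ()   # character offsets where a hyphen may be inserted
    part_of_speech: str = ""  # a single grammatical attribute, for illustration


# Illustrative sample data, not an authoritative dictionary.
WORD_LIST = {
    entry.word: entry
    for entry in (
        Entry("spelling", (5,), "noun"),
        Entry("checker", (5,), "noun"),
        Entry("word", (), "noun"),
    )
}


def hyphenate(word, word_list=WORD_LIST):
    """Return the word with hyphens at its stored break points, or None if unknown."""
    entry = word_list.get(word)
    if entry is None:
        return None
    pieces, start = [], 0
    for point in entry.hyphenation:
        pieces.append(word[start:point])
        start = point
    pieces.append(word[start:])
    return "-".join(pieces)


if __name__ == "__main__":
    print(hyphenate("checker"))
```

A plain list of words needs only membership tests, whereas records like these let the same lookup serve hyphenation or grammar-related features.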

As an adjunct to these two components, the program's user interface will allow users to approve replacements and modify the program's operation.

One exception to the above paradigm is the class of spell checkers that rely solely on statistical information, such as n-grams. This approach usually requires considerable effort to gather sufficient statistical data and may need much more runtime storage. These methods are not currently in general use. Some spell checkers instead use a fixed list of misspellings, each paired with suggested corrections; this less flexible approach is often used in paper-based correction methods, such as the "see also" entries of encyclopedias.
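
Both alternatives can be illustrated with a small Python sketch. It assumes character trigrams as the n-grams and flags any word containing a trigram never seen in training; real statistical checkers may well work differently. All names and the sample data (trigrams, train, SEEN, FIXED_CORRECTIONS and so on) are hypothetical.

```python
def trigrams(word):
    """Return the set of character trigrams of a word, padded with ^ and $."""
    padded = f"^{word.lower()}$"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def train(corpus_words):
    """Collect every trigram that occurs in a sample of correct words."""
    seen = set()
    for word in corpus_words:
        seen |= trigrams(word)
    return seen


def looks_misspelled(word, seen):
    """Flag a word if it contains at least one trigram absent from training."""
    return not trigrams(word) <= seen


# Hypothetical training sample; a usable model would need far more text.
SEEN = train(["spell", "spelling", "checker", "check", "design", "word", "words"])

# Fixed-list alternative: known misspellings mapped to suggestions (sample data).
FIXED_CORRECTIONS = {"teh": "the", "recieve": "receive"}


def suggest(word):
    """Return the stored suggestion for a known misspelling, else None."""
    return FIXED_CORRECTIONS.get(word.lower())


if __name__ == "__main__":
    for candidate in ("spell", "spelll", "checking"):
        print(candidate, looks_misspelled(candidate, SEEN))
    print(suggest("teh"))
```

With so little training data the trigram model also flags the correct word "checking", which hints at why this approach needs a large body of statistics. The fixed list, by contrast, can only ever catch the misspellings someone thought to enter.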
