List of Natural Language Processing Toolkits - Structures Used in Natural Language Processing

Structures Used in Natural Language Processing

  • Corpus – body of data, optionally tagged (for example, through part-of-speech tagging), providing real-world samples for analysis and comparison.
    • Text corpus – large and structured set of texts, nowadays usually stored and processed electronically. Text corpora are used for statistical analysis and hypothesis testing, for checking occurrences, and for validating linguistic rules within a specific subject (or domain).
    • Speech corpus – database of speech audio files and text transcriptions. In speech technology, speech corpora are used, among other things, to create acoustic models (which can then be used with a speech recognition engine). In linguistics, spoken corpora are used for research in phonetics, conversation analysis, dialectology and other fields.
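
As a rough illustration of the ideas above, a tagged text corpus can be held in memory as a list of sentences, each a list of (token, tag) pairs, and then queried for occurrences. The sketch below is a minimal example under stated assumptions: the three sentences, the tag labels, and the function names (token_counts, tag_counts, tokens_with_tag) are all hypothetical and do not come from any particular toolkit or tagset.

```python
from collections import Counter

# Hypothetical toy corpus: each sentence is a list of (token, POS tag) pairs.
# The tag labels are illustrative only, not a specific standard tagset.
TAGGED_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]


def token_counts(corpus):
    """Count how often each token occurs across all sentences."""
    return Counter(token for sentence in corpus for token, _tag in sentence)


def tag_counts(corpus):
    """Count how often each part-of-speech tag occurs."""
    return Counter(tag for sentence in corpus for _token, tag in sentence)


def tokens_with_tag(corpus, wanted_tag):
    """Return the set of distinct tokens that carry the given tag."""
    return {
        token
        for sentence in corpus
        for token, tag in sentence
        if tag == wanted_tag
    }


if __name__ == "__main__":
    # Simple occurrence checks of the kind a text corpus supports.
    print(token_counts(TAGGED_CORPUS).most_common(2))
    print(tag_counts(TAGGED_CORPUS))
    print(sorted(tokens_with_tag(TAGGED_CORPUS, "VERB")))
```

Real corpora are far larger and are usually read from files through a toolkit's corpus reader, but the same counting pattern applies once the tagged tokens are loaded.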
