List of Natural Language Processing Toolkits - Structures Used in Natural Language Processing

Structures Used in Natural Language Processing

Corpus – body of data, optionally tagged (for example, through part-of-speech tagging), providing real-world samples for analysis and comparison.
- Text corpus – large and structured set of texts, nowadays usually stored and processed electronically. Text corpora are used for statistical analysis and hypothesis testing, such as checking occurrences or validating linguistic rules within a specific subject (or domain).
- Speech corpus – database of speech audio files and text transcriptions. In speech technology, speech corpora are used, among other things, to create acoustic models (which can then be used with a speech recognition engine). In linguistics, spoken corpora are used for research into phonetics, conversation analysis, dialectology and other fields.
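
As a rough illustration of how a tagged text corpus supports occurrence checks and simple rule validation, the sketch below uses only the Python standard library. The corpus contents, the tag names (loosely modelled on Universal POS tags), and the helper functions `token_counts` and `tags_following` are all hypothetical and invented for this example; real toolkits provide their own corpus readers and tag sets.

```python
from collections import Counter

# Hypothetical miniature corpus: each sentence is a list of (token, POS tag) pairs.
# The sentences and tags are invented for illustration only.
TAGGED_CORPUS = [
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("small", "ADJ"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("dog", "NOUN"), ("sees", "VERB"), ("the", "DET"), ("cat", "NOUN")],
]


def token_counts(corpus):
    """Count how often each token occurs across all sentences."""
    return Counter(token for sentence in corpus for token, _ in sentence)


def tags_following(corpus, tag):
    """Count which tags directly follow `tag` within a sentence."""
    following = Counter()
    for sentence in corpus:
        # Pair each (token, tag) entry with its right-hand neighbour.
        for (_, current), (_, nxt) in zip(sentence, sentence[1:]):
            if current == tag:
                following[nxt] += 1
    return following


counts = token_counts(TAGGED_CORPUS)
after_det = tags_following(TAGGED_CORPUS, "DET")

print(counts.most_common(3))
print(dict(after_det))
```

In this toy data a determiner is always followed by a noun or an adjective, which is the kind of linguistic regularity a larger corpus could be used to test statistically.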
