Text Segmentation

Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics. The term applies both to the mental processes humans use when reading and to artificial processes implemented in computers, which are the subject of natural language processing. The problem is non-trivial: some written languages have explicit word boundary markers, such as the word spaces of written English and the distinctive initial, medial and final letter shapes of Arabic, but these signals are sometimes ambiguous, and not every written language has them.
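As a rough illustration of why boundary markers alone are not enough, the Python sketch below contrasts three naive segmenters. The function names and the tiny lexicon are hypothetical, chosen only for this example; they are not taken from any particular library or from a specific published method.

```python
import re

def split_on_spaces(text):
    """Word segmentation by whitespace; adequate only where spaces mark word boundaries."""
    return text.split()

def naive_sentences(text):
    """Sentence segmentation that treats every '.', '!' or '?' followed by whitespace as a boundary."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]

def max_match(text, lexicon):
    """Greedy longest-match word segmentation for text written without spaces.

    `lexicon` is an assumed set of known words; unknown material falls back
    to single characters.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; accept a single character as a last resort.
        for j in range(len(text), i, -1):
            if text[i:j] in lexicon or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words

# Spaces work for simple English text.
print(split_on_spaces("the cat sat"))

# A period is an ambiguous signal: the abbreviation "Dr." is split off as a sentence.
print(naive_sentences("Dr. Smith arrived. He sat."))

# Without spaces, a greedy match can pick "them" + "en" where "the" + "men" was meant.
print(max_match("themen", {"the", "them", "men", "en"}))
```

The second and third calls show the two difficulties named above: a marker that is present but ambiguous, and text in which the marker is absent altogether.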

Compare speech segmentation, the process of dividing speech into linguistically meaningful portions.
