Text Segmentation

Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics. The term applies both to the mental processes humans use when reading and to artificial processes implemented in computers, which are studied in natural language processing. The problem is non-trivial: some written languages have explicit word boundary markers, such as the word spaces of written English and the distinctive initial, medial and final letter shapes of Arabic, but such signals are sometimes ambiguous and are absent from some written languages.
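
As a rough illustration of why explicit markers are not enough, the following minimal Python sketch segments text using only whitespace and sentence-final punctuation. The function names naive_words and naive_sentences are hypothetical, chosen for this example, and the rules are deliberately simplistic; this is not a description of any particular segmentation system.

```python
import re


def naive_words(text):
    """Split text into words on runs of whitespace (illustrative only)."""
    return text.split()


def naive_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace.

    This treats every such punctuation mark as a sentence boundary,
    which is exactly the kind of ambiguous signal described above.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


if __name__ == "__main__":
    # Spaces separate words, but punctuation stays attached to them.
    print(naive_words("Dr. Smith isn't here."))

    # The period in the abbreviation "Dr." is mistaken for a sentence end.
    print(naive_sentences("Dr. Smith arrived. He left."))

    # Assumed example of a language written without word spaces (Chinese):
    # the whole string comes back as a single "word".
    print(naive_words("我喜欢读书"))
```

In this sketch the abbreviation "Dr." is split off as its own sentence, and the unspaced string is returned as one unit, so both the ambiguity of boundary markers and their absence show up even in very short inputs.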

Compare speech segmentation, the process of dividing speech into linguistically meaningful portions.
