Hapax Legomenon - Computer Science

Computer Science

In the fields of computational linguistics and natural language processing (NLP), esp. corpus linguistics and machine-learned NLP, it is common to disregard hapax legomena (and sometimes other infrequent words), as they are likely to have little value for computational techniques. This disregard has the added benefit of significantly reducing the memory use of an application, since, by Zipf's law, many words are hapaxes.

Read more about this topic:  Hapax Legomenon

Famous quotes containing the words computer and/or science:

    The archetype of all humans, their ideal image, is the computer, once it has liberated itself from its creator, man. The computer is the essence of the human being. In the computer, man reaches his completion.
    Friedrich Dürrenmatt (1921–1990)

    The belief that established science and scholarship—which have so relentlessly excluded women from their making—are “objective” and “value-free” and that feminist studies are “unscholarly,” “biased,” and “ideological” dies hard. Yet the fact is that all science, and all scholarship, and all art are ideological; there is no neutrality in culture!
    Adrienne Rich (b. 1929)