Hapax Legomenon - Computer Science

Computer Science

In the fields of computational linguistics and natural language processing (NLP), esp. corpus linguistics and machine-learned NLP, it is common to disregard hapax legomena (and sometimes other infrequent words), as they are likely to have little value for computational techniques. This disregard has the added benefit of significantly reducing the memory use of an application, since, by Zipf's law, many words are hapaxes.

Read more about this topic:  Hapax Legomenon

Famous quotes containing the words computer and/or science:

    What, then, is the basic difference between today’s computer and an intelligent being? It is that the computer can be made to see but not to perceive. What matters here is not that the computer is without consciousness but that thus far it is incapable of the spontaneous grasp of pattern—a capacity essential to perception and intelligence.
    Rudolf Arnheim (b. 1904)

    What is done for science must also be done for art: accepting undesirable side effects for the sake of the main goal, and moreover diminishing their importance by making this main goal more magnificent. For one should reform forward, not backward: social illnesses, revolutions, are evolutions inhibited by a conserving stupidity.
    Robert Musil (1880–1942)