Hapax Legomenon - Computer Science

Computer Science

In the fields of computational linguistics and natural language processing (NLP), esp. corpus linguistics and machine-learned NLP, it is common to disregard hapax legomena (and sometimes other infrequent words), as they are likely to have little value for computational techniques. This disregard has the added benefit of significantly reducing the memory use of an application, since, by Zipf's law, many words are hapaxes.

Read more about this topic:  Hapax Legomenon

Famous quotes containing the words computer and/or science:

    Family life is not a computer program that runs on its own; it needs continual input from everyone.
    Neil Kurshan (20th century)

    It is unheard-of, uncivilized barbarism that any woman should still be forced to bear such monstrous torture. It should be remedied. It should be stopped. It is simply absurd that, with our modern science, painless childbirth does not exist as a matter of course.... I tremble with indignation when I think of ... the unspeakable egotism and blindness of men of science who permit such atrocities when they can be remedied.
    Isadora Duncan (1878–1927)