WordNet - Database Contents

As of November 2012, WordNet's latest online version is 3.1 (announced in June 2011), but the latest released version is 3.0 (released in December 2006). The 3.0 database contains 155,287 words organized in 117,659 synsets, for a total of 206,941 word-sense pairs; in compressed form, it is about 12 megabytes in size.
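The three counts quoted for version 3.0 can be related with a little arithmetic. The sketch below only divides the figures given in the text; the constant and function names are illustrative and not part of WordNet.

```python
# Counts quoted in the text for the WordNet 3.0 database.
WORD_COUNT = 155_287   # distinct words
SYNSET_COUNT = 117_659  # synonym sets
PAIR_COUNT = 206_941    # word-sense pairs


def average_senses_per_word() -> float:
    """Each word-sense pair is one sense of one word."""
    return PAIR_COUNT / WORD_COUNT


def average_words_per_synset() -> float:
    """Each word-sense pair is also one member of one synset."""
    return PAIR_COUNT / SYNSET_COUNT


print(f"senses per word:  {average_senses_per_word():.2f}")   # 1.33
print(f"words per synset: {average_words_per_synset():.2f}")  # 1.76
```

In other words, an average word has between one and two senses, and an average synset holds fewer than two words.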

WordNet distinguishes between nouns, verbs, adjectives and adverbs because they follow different grammatical rules. It does not include prepositions, determiners, etc. Every synset contains a group of synonymous words or collocations (a collocation is a sequence of words that go together to form a specific meaning, such as "car pool"); different senses of a word are in different synsets. The meaning of each synset is further clarified with a short defining gloss (a definition and/or example sentences). A typical example synset with gloss is:

good, right, ripe – (most suitable or right for a particular purpose; "a good time to plant tomatoes"; "the right time to act"; "the time is ripe for great sociological changes")
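One way to picture this structure is as a small record type plus a word index. The following is a minimal sketch, not WordNet's actual storage format: the `Synset` class and `build_index` function are hypothetical names, and the second synset carries a placeholder gloss invented purely to show that a second sense of "right" lives in its own synset.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """A group of synonymous words sharing one meaning (hypothetical model)."""
    words: tuple[str, ...]
    definition: str
    examples: tuple[str, ...] = ()

    def gloss(self) -> str:
        # Definition followed by quoted example sentences, as in the text.
        parts = [self.definition] + [f'"{e}"' for e in self.examples]
        return "; ".join(parts)


def build_index(synsets):
    """Map each word to every synset (i.e. every sense) that contains it."""
    index = defaultdict(list)
    for synset in synsets:
        for word in synset.words:
            index[word].append(synset)
    return index


# The example synset from the text.
GOOD = Synset(
    words=("good", "right", "ripe"),
    definition="most suitable or right for a particular purpose",
    examples=(
        "a good time to plant tomatoes",
        "the right time to act",
        "the time is ripe for great sociological changes",
    ),
)
# Placeholder entry (invented for illustration, not taken from WordNet).
OTHER = Synset(words=("right", "correct"), definition="placeholder gloss")

index = build_index([GOOD, OTHER])
print(len(index["right"]))  # 2 senses in this toy data
print(GOOD.gloss())
```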

Most synonym sets are connected to other synsets via a number of semantic relations. These relations vary based on the type of word, and include:

  • Nouns
    • hypernym: Y is a hypernym of X if every X is a (kind of) Y (canine is a hypernym of dog)
    • hyponym: Y is a hyponym of X if every Y is a (kind of) X (dog is a hyponym of canine)
    • coordinate term: Y is a coordinate term of X if X and Y share a hypernym (wolf is a coordinate term of dog, and dog is a coordinate term of wolf)
    • holonym: Y is a holonym of X if X is a part of Y (building is a holonym of window)
    • meronym: Y is a meronym of X if Y is a part of X (window is a meronym of building)
  • Verbs
    • hypernym: the verb Y is a hypernym of the verb X if the activity X is a (kind of) Y (to perceive is a hypernym of to listen)
    • troponym: the verb Y is a troponym of the verb X if the activity Y is doing X in some manner (to lisp is a troponym of to talk)
    • entailment: the verb Y is entailed by X if by doing X you must be doing Y (to sleep is entailed by to snore)
    • coordinate terms: those verbs sharing a common hypernym (to lisp and to yell)
  • Adjectives
    • related nouns
    • similar to
    • participle of verb
  • Adverbs
    • root adjectives
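The noun relations listed above come in inverse pairs (hypernym/hyponym, holonym/meronym), and coordinate terms can be derived from them. The sketch below illustrates this on a toy graph built only from the examples in the list. It is a simplification: the nodes here are plain words, whereas WordNet relates synsets, and all names (`HYPERNYM`, `invert`, `coordinate_terms`, and so on) are hypothetical.

```python
# Toy relation tables using only the examples from the list above.
HYPERNYM = {"dog": {"canine"}, "wolf": {"canine"}}  # X -> its hypernyms
HOLONYM = {"window": {"building"}}                   # part -> its wholes


def invert(relation):
    """Build the inverse relation: every (a -> b) becomes (b -> a)."""
    inverse = {}
    for source, targets in relation.items():
        for target in targets:
            inverse.setdefault(target, set()).add(source)
    return inverse


HYPONYM = invert(HYPERNYM)  # canine -> {dog, wolf}
MERONYM = invert(HOLONYM)   # building -> {window}


def coordinate_terms(term):
    """Terms that share at least one hypernym with `term`, excluding itself."""
    siblings = set()
    for parent in HYPERNYM.get(term, set()):
        siblings |= HYPONYM.get(parent, set())
    siblings.discard(term)
    return siblings


print(sorted(HYPONYM["canine"]))         # ['dog', 'wolf']
print(sorted(coordinate_terms("dog")))   # ['wolf']
print(sorted(MERONYM["building"]))       # ['window']
```

Storing only one direction of each pair and deriving the other keeps the two views consistent by construction.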

Semantic relations apply to all members of a synset, because they share a meaning and are all mutually synonymous. Individual words can also be connected to other words through lexical relations, including antonyms (words that are opposites of each other) and derivationally related forms.

WordNet also provides the polysemy count of a word: the number of synsets that contain the word. If a word participates in several synsets (i.e. has several senses), then typically some senses are much more common than others. WordNet quantifies this with a frequency score: several sample texts have had all their words semantically tagged with the corresponding synset, and a count is provided indicating how often a word appears in each specific sense.
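Both measures reduce to simple counting. The following is a minimal sketch under stated assumptions: the synset identifiers, the extra synsets for "right", and the tiny tagged corpus are all invented for illustration, and the function names are hypothetical rather than part of WordNet's software.

```python
from collections import Counter

# Toy synsets keyed by a made-up identifier (not real WordNet IDs).
TOY_SYNSETS = {
    "good.a.01": {"good", "right", "ripe"},
    "right.a.02": {"right", "correct"},
    "right.n.01": {"right"},
}

# Toy sense-tagged corpus: each token is paired with the synset it was tagged as.
TAGGED_TOKENS = [
    ("right", "good.a.01"),
    ("right", "right.a.02"),
    ("right", "right.a.02"),
    ("good", "good.a.01"),
]


def polysemy_count(word):
    """Number of synsets that contain the word."""
    return sum(1 for members in TOY_SYNSETS.values() if word in members)


def sense_frequencies(word):
    """How often the word was tagged with each sense in the sample corpus."""
    return Counter(sense for token, sense in TAGGED_TOKENS if token == word)


print(polysemy_count("right"))                    # 3
print(sense_frequencies("right").most_common(1))  # [('right.a.02', 2)]
```

Note that a sense can have a frequency of zero: here "right" belongs to three synsets but was tagged with only two of them.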

The morphology functions of the software distributed with the database try to deduce the lemma or root form of a word from the user's input; only the root form is stored in the database unless it has irregular inflected forms.
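The general idea can be sketched as an exception table for irregular forms plus suffix-stripping rules checked against the stored root forms. The code below is an illustration only: the lexicon, the exception table, the rule list, and the `lemmatize` function are assumptions made up for this example, not WordNet's actual data or API.

```python
# Toy lexicon of stored root forms (nouns only, invented for illustration).
LEMMAS = {"dog", "box", "church", "city", "mouse"}

# Irregular inflected forms are listed explicitly.
EXCEPTIONS = {"mice": "mouse"}

# (suffix to remove, text to append); longer suffixes are tried first.
SUFFIX_RULES = [("ches", "ch"), ("xes", "x"), ("ies", "y"), ("s", "")]


def lemmatize(word):
    """Return the root form of `word`, or None if none can be deduced."""
    word = word.lower()
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    if word in LEMMAS:
        return word
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: -len(suffix)] + replacement
            # Accept a candidate only if it is a stored root form.
            if candidate in LEMMAS:
                return candidate
    return None


for form in ["dogs", "boxes", "churches", "cities", "mice", "xyzzy"]:
    print(form, "->", lemmatize(form))
```

Checking each candidate against the lexicon is what keeps the rules from inventing root forms that do not exist.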
