Semantic Search - Disambiguation

Disambiguation

To understand what a user is searching for, a semantic search system must perform word sense disambiguation. When a term is ambiguous, that is, when it can have several meanings (the lemma "bark", for example, can denote "the sound of a dog," "the skin of a tree," or "a three-masted sailing ship"), the disambiguation process selects the most probable meaning from among the possible ones.

This process draws on other information available in the semantic analysis system and takes into account the meanings of the other words in the sentence and in the rest of the text. In essence, the determination of each meaning influences the disambiguation of the others, until the sentence reaches a state of maximum plausibility and coherence. All the information needed for disambiguation, that is, all the knowledge used by the system, is represented as a semantic network organized on a conceptual basis.
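The mutual influence between word meanings can be illustrated with a small sketch. It is not the algorithm of any particular system: the sense inventory `SENSES`, the sense labels, and the scoring function `coherence` are all hypothetical, and the score is a simple overlap count chosen only to show how the senses of several ambiguous words can be selected jointly so that the sentence as a whole is most coherent.

```python
from itertools import product

# Hypothetical toy sense inventory: each ambiguous word maps to candidate
# senses, and each sense lists concepts related to it (a stand-in for the
# neighbouring nodes of a semantic network).
SENSES = {
    "bark": {
        "bark/sound": {"dog", "animal", "noise"},
        "bark/tree": {"tree", "plant", "wood"},
        "bark/ship": {"ship", "sail", "sea"},
    },
    "trunk": {
        "trunk/tree": {"tree", "plant", "wood"},
        "trunk/luggage": {"luggage", "travel", "container"},
        "trunk/elephant": {"elephant", "animal", "nose"},
    },
}


def coherence(assignment, context):
    """Score one combination of senses (assumed scoring, for illustration).

    The score counts concepts shared with the unambiguous context words,
    plus concepts shared between every pair of chosen senses.
    """
    related = [SENSES[word][sense] for word, sense in assignment.items()]
    score = sum(len(concepts & context) for concepts in related)
    for i in range(len(related)):
        for j in range(i + 1, len(related)):
            score += len(related[i] & related[j])
    return score


def disambiguate(words):
    """Pick the combination of senses with the highest coherence score."""
    ambiguous = list(dict.fromkeys(w for w in words if w in SENSES))
    context = {w for w in words if w not in SENSES}
    best, best_score = None, -1
    # Try every combination, so each choice is judged together with the others.
    for combo in product(*(SENSES[w] for w in ambiguous)):
        assignment = dict(zip(ambiguous, combo))
        score = coherence(assignment, context)
        if score > best_score:
            best, best_score = assignment, score
    return best


result = disambiguate("the bark on the trunk of the old tree".split())
```

Here the tree-related senses of "bark" and "trunk" reinforce each other and the context word "tree", so that combination scores highest. Exhaustive enumeration is only practical for a handful of ambiguous words; it is used here for clarity.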

In a structure of this type, every lexical concept therefore corresponds to a node of the semantic network and is linked to other nodes by specific semantic relationships, forming a hierarchical structure with inheritance. In this way, each concept is enriched with the characteristics and meaning of the nearby nodes.
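As a minimal sketch of how a concept can be enriched by its position in the hierarchy, the following example walks up a chain of parent links and collects characteristics along the way. The network contents, the characteristic labels, and the function name `inherited_characteristics` are assumptions made for illustration only.

```python
# Hypothetical toy hierarchy: concept -> (parent concept, own characteristics).
NETWORK = {
    "entity": (None, set()),
    "plant": ("entity", {"living", "grows"}),
    "tree": ("plant", {"has trunk", "woody"}),
    "oak": ("tree", {"bears acorns"}),
}


def inherited_characteristics(concept):
    """Collect a concept's own characteristics plus those of all its ancestors."""
    traits = set()
    node = concept
    while node is not None:
        parent, own = NETWORK[node]
        traits |= own
        node = parent
    return traits


oak_traits = inherited_characteristics("oak")
```

In this toy network "oak" ends up with the characteristics of "tree" and "plant" in addition to its own, without those characteristics being stored on the "oak" node itself.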

Every node of the network, called a synset, groups a set of synonyms that represent the same lexical concept. A synset can contain:

  • single lemmata ('seat', 'vacation', 'work', 'quick', 'quickly', 'more', etc.)
  • compounds ('non-stop', 'abat-jour', 'policeman')
  • collocations ('credit card', 'university degree', 'treasury stock', 'go forward', etc.)

The links between synsets, which express the semantic relationships among them, are the ordering principles by which the concepts of the semantic network are organized.
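The synset and link structure described above could be represented roughly as follows. This is a sketch under stated assumptions: the `Synset` class, its field names, the synset keys, and the relation names `is_a` and `part_of` are hypothetical and do not reflect the schema of any specific semantic network.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Synset:
    key: str                 # hypothetical identifier of the lexical concept
    lemmas: tuple            # single lemmata, compounds, or collocations
    links: dict = field(default_factory=dict)  # relation name -> target keys


# Toy data: an ambiguous lemma such as "bark" belongs to several synsets.
SYNSETS = [
    Synset("police_officer", ("policeman", "police officer"), {"is_a": ["person"]}),
    Synset("vacation", ("vacation", "holiday")),
    Synset("bark_sound", ("bark",), {"is_a": ["sound"]}),
    Synset("bark_tree", ("bark",), {"part_of": ["tree"]}),
    Synset("bark_ship", ("bark", "barque"), {"is_a": ["sailing_ship"]}),
]


def build_index(synsets):
    """Map each lemma to the keys of all synsets that contain it."""
    index = defaultdict(list)
    for synset in synsets:
        for lemma in synset.lemmas:
            index[lemma].append(synset.key)
    return index


INDEX = build_index(SYNSETS)
candidates = INDEX["bark"]  # the senses a disambiguation step must choose among
```

Looking up a lemma in the index returns its candidate synsets; a lemma with more than one candidate is ambiguous and has to be resolved using the links and the surrounding text.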
