Cross-language Information Retrieval

Cross-language information retrieval (CLIR) is a subfield of information retrieval dealing with retrieving information written in a language different from the language of the user's query. For example, a user may pose their query in English but retrieve relevant documents written in French. To do so, most of CLIR systems use translation techniques. CLIR techniques can be classified into different categories based on different translation resources:

  • Dictionary-based CLIR techniques
  • Parallel corpora based CLIR techniques
  • Comparable corpora based CLIR techniques
  • Machine translator based CLIR techniques

The first workshop on CLIR was held in Zürich during the SIGIR-96 conference. Workshops have been held yearly since 2000 at the meetings of the Cross Language Evaluation Forum (CLEF).

The term "cross-language information retrieval" has many synonyms, of which the following are perhaps the most frequent: cross-lingual information retrieval, translingual information retrieval, multilingual information retrieval. The term "multilingual information retrieval" refers to CLIR in general, but it also has a specific meaning of cross-language information retrieval where a document collection is multilingual.

Famous quotes containing the word information:

    The information links are like nerves that pervade and help to animate the human organism. The sensors and monitors are analogous to the human senses that put us in touch with the world. Data bases correspond to memory; the information processors perform the function of human reasoning and comprehension. Once the postmodern infrastructure is reasonably integrated, it will greatly exceed human intelligence in reach, acuity, capacity, and precision.
    Albert Borgman, U.S. educator, author. Crossing the Postmodern Divide, ch. 4, University of Chicago Press (1992)