Latent Semantic Indexing

Latent semantic indexing (LSI) is an indexing and retrieval method that uses a mathematical technique called singular value decomposition (SVD) to identify patterns in the relationships between the terms and concepts contained in an unstructured collection of text. LSI is based on the principle that words that are used in the same contexts tend to have similar meanings. A key feature of LSI is its ability to extract the conceptual content of a body of text by establishing associations between those terms that occur in similar contexts.
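As a rough illustration of the SVD step, the sketch below builds a small term-document count matrix and keeps only the largest singular values to obtain a low-rank "latent" representation. The terms, documents, counts, and the use of NumPy are assumptions made for illustration, not part of the original description.

import numpy as np

# Hypothetical 5-term x 4-document count matrix (rows: terms, columns: documents).
A = np.array([
    [2, 0, 1, 0],   # "ship"
    [1, 0, 0, 0],   # "boat"
    [0, 1, 0, 2],   # "tree"
    [0, 2, 0, 1],   # "forest"
    [1, 0, 2, 0],   # "ocean"
], dtype=float)

# Singular value decomposition: A = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Truncate to the k largest singular values; this rank-k space is the
# "latent semantic" space in which related terms and documents cluster.
k = 2
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

# Rank-k approximation of the original term-document matrix.
A_k = U_k @ np.diag(s_k) @ Vt_k
print(np.round(A_k, 2))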

LSI is also an application of correspondence analysis, a multivariate statistical technique developed by Jean-Paul Benzécri in the early 1970s, to a contingency table built from word counts in documents.

Called Latent Semantic Indexing because of its ability to correlate semantically related terms that are latent in a collection of text, it was first applied to text at Bellcore (Bell Communications Research) in the late 1980s. The method, also called latent semantic analysis (LSA), uncovers the underlying latent semantic structure in the usage of words in a body of text, and that structure can then be used to extract the meaning of the text in response to user queries, commonly referred to as concept searches. Queries, or concept searches, against a set of documents that have undergone LSI will return results that are conceptually similar in meaning to the search criteria even if the results do not share a specific word or words with the search criteria.
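This query behavior can be sketched roughly as follows: a query is represented as a term-count vector, "folded in" to the truncated SVD space, and compared with the document vectors by cosine similarity. The matrices, the example query, and the scoring choices here are illustrative assumptions, reusing the hypothetical matrix from the earlier sketch.

import numpy as np

# Same hypothetical term-document matrix as above
# (rows: ship, boat, tree, forest, ocean; columns: documents).
A = np.array([
    [2, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 2],
    [0, 2, 0, 1],
    [1, 0, 2, 0],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
U_k, s_k = U[:, :k], s[:k]
doc_vecs = Vt[:k, :].T                      # one row per document in the latent space

# Query "boat ocean" as a term-count vector, folded into the latent space:
# q_hat = q^T U_k diag(1/s_k)
q = np.array([0, 1, 0, 0, 1], dtype=float)
q_hat = q @ U_k @ np.diag(1.0 / s_k)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Documents that share the ship/ocean concept can rank highly even if they
# never contain the literal word "boat".
scores = sorted(enumerate(cosine(q_hat, d) for d in doc_vecs), key=lambda t: -t[1])
print(scores)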

