In machine learning, semantic analysis of a corpus is the task of building structures that approximate concepts from a large set of documents. It generally does not involve prior semantic understanding of the documents.
Latent semantic analysis (sometimes called latent semantic indexing) is a class of techniques in which documents are represented as vectors in term space. A prominent example is probabilistic latent semantic indexing (PLSI).
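As a minimal sketch of the term-space representation, the following builds one count vector per document over a shared vocabulary and compares documents by cosine similarity. The function names `term_vectors` and `cosine`, the whitespace tokenization, and the sample documents are assumptions made for illustration. The dimensionality reduction step that latent semantic analysis normally applies to such vectors is omitted here.

```python
import math
from collections import Counter


def term_vectors(docs):
    """Map each document to a count vector over the shared vocabulary."""
    tokenized = [doc.lower().split() for doc in docs]
    # One axis of the term space per distinct term, in sorted order.
    vocab = sorted({term for tokens in tokenized for term in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical sample corpus.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stocks fell on monday",
]
vocab, vectors = term_vectors(docs)
print(round(cosine(vectors[0], vectors[1]), 2))  # 0.75
print(round(cosine(vectors[0], vectors[2]), 2))  # 0.18
```

The first two documents share most of their terms and so lie close together in term space, while the third shares only one term with them.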
Latent Dirichlet allocation involves attributing document terms to topics.
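One common way to attribute terms to topics is collapsed Gibbs sampling, sketched below under stated assumptions. The function name `lda_gibbs`, the hyperparameter values `alpha` and `beta`, the iteration count, and the sample documents are all illustrative choices, and this is not the only inference method used for latent Dirichlet allocation.

```python
import random


def lda_gibbs(docs, num_topics, iterations=50, alpha=0.1, beta=0.01, seed=0):
    """Assign each token of each document to a topic by collapsed Gibbs sampling.

    docs is a list of token lists. Returns a parallel list of topic indices.
    alpha and beta are assumed symmetric Dirichlet hyperparameters.
    """
    rng = random.Random(seed)
    vocab_size = len({term for doc in docs for term in doc})
    doc_topic = [[0] * num_topics for _ in docs]      # tokens of doc d in topic k
    topic_term = [{} for _ in range(num_topics)]      # count of each term in topic k
    topic_total = [0] * num_topics                    # total tokens in topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        topics = []
        for term in doc:
            k = rng.randrange(num_topics)
            topics.append(k)
            doc_topic[d][k] += 1
            topic_term[k][term] = topic_term[k].get(term, 0) + 1
            topic_total[k] += 1
        assignments.append(topics)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, term in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_term[k][term] -= 1
                topic_total[k] -= 1
                # Weight each topic by how much the document uses it and
                # how much the topic uses this term.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_term[t].get(term, 0) + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_term[k][term] = topic_term[k].get(term, 0) + 1
                topic_total[k] += 1
    return assignments


# Hypothetical sample corpus, already tokenized.
docs = [
    ["apple", "banana", "apple", "fruit"],
    ["goal", "match", "team", "goal"],
    ["fruit", "banana", "team", "match"],
]
for doc, topics in zip(docs, lda_gibbs(docs, num_topics=2)):
    print(list(zip(doc, topics)))
```

Each printed pair shows a term and the topic index it was attributed to. The indices themselves carry no meaning beyond grouping terms, and the result depends on the seed and the hyperparameters.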
n-grams and hidden Markov models represent the term stream as a Markov chain, in which each term depends only on the few terms before it.
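The simplest case is a bigram model, where each term depends only on the single term before it. The sketch below estimates those transition probabilities from counts. The function name `train_bigram` and the sample sentence are assumptions for illustration, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Estimate P(next term | previous term) from adjacent pairs in tokens."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    # Normalize each row of counts into a probability distribution.
    model = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        model[prev] = {term: n / total for term, n in followers.items()}
    return model


# Hypothetical sample term stream.
tokens = "the cat sat on the mat".split()
model = train_bigram(tokens)
print(model["the"])  # {'cat': 0.5, 'mat': 0.5}
print(model["cat"])  # {'sat': 1.0}
```

Longer n-grams follow the same pattern with a tuple of the preceding terms as the key in place of a single term.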