Factored Language Model

The factored language model (FLM) is an extension of a conventional language model. In an FLM, each word is viewed as a vector of $k$ factors: $w_i = \{f_i^1, \ldots, f_i^k\}$. An FLM provides the probabilistic model $P(f \mid f_1, \ldots, f_N)$, where the prediction of a factor $f$ is based on $N$ parents $\{f_1, \ldots, f_N\}$. For example, if $w$ represents a word token and $t$ represents a part-of-speech tag for English, the expression $P(w_i \mid w_{i-2}, w_{i-1}, t_{i-1})$ gives a model for predicting the current word token based on a traditional N-gram model as well as the part-of-speech tag of the previous word.
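The word-plus-tag example can be made concrete with a small counting sketch. The Python below is only an illustration under stated assumptions: the toy corpus, the trigram-sized word context, the `<s>` padding symbol, and the helper names `count_events` and `mle_prob` are all hypothetical and not taken from any FLM toolkit. It treats each token as a bundle of two factors (word, part-of-speech tag) and computes an unsmoothed maximum-likelihood estimate of the current word given the two previous words and the previous tag.

```python
from collections import Counter

# Hypothetical toy corpus: every token is a bundle of factors (word, POS tag).
corpus = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

def count_events(sentences):
    """Count (child, parents) events, where the child is the current word and
    the parents are (word two back, previous word, previous tag)."""
    event_counts = Counter()
    context_counts = Counter()
    for sent in sentences:
        # Pad with an assumed start symbol so early tokens have parents.
        padded = [("<s>", "<s>")] * 2 + sent
        for i in range(2, len(padded)):
            parents = (padded[i - 2][0], padded[i - 1][0], padded[i - 1][1])
            child = padded[i][0]
            event_counts[(child, parents)] += 1
            context_counts[parents] += 1
    return event_counts, context_counts

def mle_prob(child, parents, event_counts, context_counts):
    """Unsmoothed relative-frequency estimate of P(child | parents)."""
    total = context_counts[parents]
    if total == 0:
        return 0.0  # unseen context; a real model would back off here
    return event_counts[(child, parents)] / total

event_counts, context_counts = count_events(corpus)
p = mle_prob("barks", ("the", "dog", "NN"), event_counts, context_counts)
print(p)
```

In this toy corpus the context ("the", "dog", "NN") occurs twice and is followed by "barks" once, so the estimate is 0.5. Unseen contexts get probability zero here, which is why smoothing is needed in practice.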

A major advantage of factored language models is that they allow users to specify linguistic knowledge, such as the relationship between word tokens and part-of-speech tags in English, or morphological information (stems, roots, etc.) in Arabic.

As with N-gram models, smoothing techniques are necessary in parameter estimation. In particular, generalized back-off is used in training an FLM.
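A rough sketch of how generalized back-off might look follows. Unlike ordinary N-gram back-off, which always drops the most distant word, a set of parent factors can be reduced by dropping any one parent, so the lower-order estimates form a graph with several paths. The Python below rests on several assumptions that are not taken from the text above: absolute discounting with a fixed discount of 0.5, add-one smoothing at the node with no parents, and combining alternative paths by taking the maximum of the lower-order probabilities. The data and all names are hypothetical, and this is not the procedure of any particular toolkit.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy events: (child word, (previous word, previous POS tag)).
events = [
    ("dog", ("the", "DT")), ("dog", ("the", "DT")), ("cat", ("the", "DT")),
    ("cat", ("a", "DT")), ("barks", ("dog", "NN")), ("sleeps", ("cat", "NN")),
]
VOCAB = sorted({child for child, _ in events})
DISCOUNT = 0.5  # assumed absolute discount

# Count each child under every subset of its parents; a dropped parent is None.
counts = Counter()
for child, parents in events:
    n = len(parents)
    for keep in range(n + 1):
        for idx in combinations(range(n), keep):
            ctx = tuple(p if i in idx else None for i, p in enumerate(parents))
            counts[(child, ctx)] += 1

def lower_contexts(ctx):
    """Edges of the back-off graph: drop any one parent that is still present."""
    return [ctx[:i] + (None,) + ctx[i + 1:] for i, p in enumerate(ctx) if p is not None]

def backoff_prob(child, ctx):
    """P(child | ctx) with back-off over all single-parent-drop paths."""
    total = sum(counts[(w, ctx)] for w in VOCAB)
    lower = lower_contexts(ctx)
    if not lower:
        # No parents left: add-one smoothed unigram (an assumption).
        return (counts[(child, ctx)] + 1) / (total + len(VOCAB))
    seen = [w for w in VOCAB if counts[(w, ctx)] > 0]
    unseen = [w for w in VOCAB if counts[(w, ctx)] == 0]
    if not unseen:
        return counts[(child, ctx)] / total  # nothing to back off to
    if child in seen:
        return (counts[(child, ctx)] - DISCOUNT) / total
    # Combine the alternative back-off paths; max is one possible choice.
    def g(w):
        return max(backoff_prob(w, c) for c in lower)
    held_out = 1.0 - sum((counts[(w, ctx)] - DISCOUNT) / total for w in seen)
    alpha = held_out / sum(g(w) for w in unseen)  # renormalizes the unseen mass
    return alpha * g(child)

print(backoff_prob("dog", ("the", "DT")))
print(backoff_prob("barks", ("the", "DT")))
```

The weight `alpha` spreads the probability mass freed by discounting over the words unseen in the full context, so each conditional distribution still sums to one. Other combination functions, such as the mean of the lower-order estimates, could be substituted for `max` in `g`.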
