In corpus linguistics, part-of-speech tagging (POS tagging or POST), also called grammatical tagging or word-category disambiguation, is the process of marking up each word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context, i.e. its relationship with adjacent and related words in a phrase, sentence, or paragraph. A simplified form of this is commonly taught to school-age children as the identification of words as nouns, verbs, adjectives, adverbs, etc.
Once performed by hand, POS tagging is now done in the context of computational linguistics, using algorithms that associate discrete terms, as well as hidden parts of speech, in accordance with a set of descriptive tags. POS-tagging algorithms fall into two distinct groups: rule-based and stochastic. E. Brill's tagger, one of the first and most widely used English POS taggers, employs rule-based algorithms.
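To make the rule-based approach concrete, the following is a minimal sketch of how a tagger might combine a word's definition with its context: each word first receives its most common tag from a small lexicon, and a contextual rule then corrects the tag based on the preceding word's tag. The lexicon, the tag names, and the single rule are hypothetical and chosen only for illustration; they are not taken from Brill's tagger or from any real tag set.

```python
# Toy lexicon mapping each word to its most frequent tag.
# ASSUMPTION: words and tag names are invented for this illustration.
LEXICON = {
    "the": "DET",
    "can": "MODAL",   # "can" is most often a modal verb
    "dogs": "NOUN",
    "bark": "VERB",
    "rusted": "VERB",
}

# Contextual rules of the form (from_tag, to_tag, previous_tag):
# "change from_tag to to_tag when the previous word has previous_tag".
# ASSUMPTION: a single hand-written rule, not a learned rule set.
RULES = [
    ("MODAL", "NOUN", "DET"),  # a "modal" right after a determiner is a noun
]


def baseline_tags(words, default="NOUN"):
    """Assign each word its lexicon tag; unknown words get a default tag."""
    return [LEXICON.get(word.lower(), default) for word in words]


def apply_rules(tags):
    """Apply each contextual rule left to right over the tag sequence."""
    tags = list(tags)
    for from_tag, to_tag, previous_tag in RULES:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == previous_tag:
                tags[i] = to_tag
    return tags


def tag(sentence):
    """Tag a whitespace-separated sentence, returning (word, tag) pairs."""
    words = sentence.split()
    return list(zip(words, apply_rules(baseline_tags(words))))


print(tag("dogs can bark"))   # "can" keeps its lexicon tag
print(tag("the can rusted"))  # "can" is retagged because it follows a determiner
```

In this sketch the word "can" is tagged differently in the two sentences even though its lexicon entry is the same, which is the sense in which tagging depends on context as well as definition. A stochastic tagger would instead choose tags from probabilities estimated on a tagged corpus rather than from hand-written rules.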