Knowledge Discovery - Extraction From Natural Language Sources

The largest share of the information contained in business documents, about 80%, is encoded in natural language and is therefore unstructured. Because unstructured data is poorly suited to knowledge extraction, more complex methods are required, and these generally yield worse results than are possible with structured data. The potential to acquire extracted knowledge on a massive scale should, however, compensate for the increased complexity and decreased quality of extraction. In the following, natural language sources are understood as sources of information in which the data is given in an unstructured fashion as plain text. The text may additionally be embedded in a markup document (e.g. an HTML document), because most systems remove the markup elements automatically.
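
The markup-removal step mentioned above can be illustrated with a minimal sketch. It assumes Python and its standard-library html.parser module; the names TextExtractor and strip_markup are hypothetical and chosen only for this illustration, and real extraction systems may handle markup quite differently.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: keeps text nodes, drops tags, scripts, and styles."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Entering an element whose content is not natural language text.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside script/style elements.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the text nodes and collapse the whitespace left by removed tags.
        return " ".join(" ".join(self._chunks).split())


def strip_markup(html):
    """Return the plain text of an HTML string (illustrative assumption)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = "<html><body><p>Invoice <b>42</b> is due.</p></body></html>"
    print(strip_markup(page))
```

The resulting plain text is what a natural language extraction method would then operate on. Because text nodes are joined with a space, a word that is split by inline markup (for example, part of a word in bold) would be separated; a production system would need to treat inline and block elements differently.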
