Knowledge Discovery - Extraction From Natural Language Sources

Most of the information contained in business documents, up to about 80%, is encoded in natural language and is therefore unstructured. Because unstructured data are poorly suited to knowledge extraction, more complex methods are required, and these generally yield worse results than the methods available for structured data. The sheer volume of knowledge that can be extracted is expected to compensate for the increased complexity and the lower quality of extraction. In what follows, natural language sources are understood as sources of information in which the data are given in unstructured form as plain text. The text may additionally be embedded in a markup document (e.g. an HTML document), because most systems remove the markup elements automatically.
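The markup-removal step mentioned above can be illustrated with a minimal sketch. It assumes Python and its standard-library `html.parser` module; the names `TextExtractor` and `strip_markup` are hypothetical helpers introduced only for this illustration and are not taken from any particular extraction system.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes and ignores the tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is not script or stylesheet content.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the text nodes with spaces and collapse runs of whitespace.
        # Simplification: a word split by inline markup ("un<b>der</b>")
        # would be separated by a space.
        return " ".join(" ".join(self._chunks).split())


def strip_markup(html):
    """Return the plain text of an HTML fragment (illustrative only)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = "<p>Acme Corp. <b>acquired</b> Widget Ltd.</p>"
    print(strip_markup(sample))
```

The resulting plain text is what a natural language extraction method would then receive as input; a real system would likely also handle malformed markup and document encodings.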
