Information Extraction - World Wide Web Applications

Information extraction (IE) has been the focus of the Message Understanding Conferences (MUC). The proliferation of the Web, however, intensified the need for IE systems that help people cope with the enormous amount of data available online. Systems that perform IE on online text should be low-cost, flexible to develop, and easy to adapt to new domains. MUC systems fail to meet those criteria. Moreover, linguistic analysis designed for unstructured text does not exploit the HTML/XML tags and layout formats that are available in online text. As a result, less linguistically intensive approaches have been developed for IE on the Web using wrappers, which are sets of highly accurate rules that extract a particular page's content. Developing wrappers manually has proved to be time-consuming and requires a high level of expertise. Machine learning techniques, either supervised or unsupervised, have been used to induce such rules automatically.
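To make the idea of a wrapper concrete, the following is a minimal sketch in Python of one simple way such rules can be expressed: each attribute is located by a left and a right delimiter string taken from the page's markup. The function name `lr_wrapper`, the rule format, and the sample catalogue page are assumptions made for illustration only; they are not taken from any particular system.

```python
from typing import List, Tuple


def lr_wrapper(page: str, rules: List[Tuple[str, str]]) -> List[Tuple[str, ...]]:
    """Apply left/right-delimiter rules to a page and return the extracted records.

    Each rule is a (left, right) pair of literal strings. The text found between
    them becomes one field of a record. Rules are applied in order, repeatedly,
    until a delimiter can no longer be found.
    """
    records: List[Tuple[str, ...]] = []
    pos = 0
    while True:
        record = []
        for left, right in rules:
            start = page.find(left, pos)
            if start == -1:
                return records  # no further records on the page
            start += len(left)
            end = page.find(right, start)
            if end == -1:
                return records  # malformed tail: stop rather than guess
            record.append(page[start:end].strip())
            pos = end + len(right)
        records.append(tuple(record))


# Hypothetical product-catalogue fragment (illustrative data only).
PAGE = (
    '<tr><td class="name">Widget</td><td class="price">9.99</td></tr>'
    '<tr><td class="name">Gadget</td><td class="price">19.50</td></tr>'
)

# Hand-written rules for this one page layout.
RULES = [
    ('<td class="name">', "</td>"),
    ('<td class="price">', "</td>"),
]

if __name__ == "__main__":
    for name, price in lr_wrapper(PAGE, RULES):
        print(name, price)
```

The rules are accurate only as long as the page keeps this exact layout, which is why writing them by hand for every site is costly and why inducing them from examples is attractive.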

Wrappers typically handle highly structured collections of web pages, such as product catalogues and telephone directories. They fail, however, when the text is less structured, which is also common on the Web. Recent work on adaptive information extraction aims at IE systems that can handle different types of text, from well-structured to almost free text (where common wrappers fail), including mixed types. Such systems can exploit shallow natural language knowledge and can therefore also be applied to less structured text.
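The contrast between the two kinds of text can be sketched as follows. This is only an illustration under stated assumptions: the markup rule, the cue words, and the function name `extract_phones` are hypothetical, and a real adaptive system would learn its rules rather than hard-code them. The sketch first tries a layout-based rule and, when that finds nothing, falls back on a shallow lexical cue (a word such as "call" followed by a phone-like token).

```python
import re
from typing import List

# Hypothetical layout rule: the number sits in a dedicated table cell.
CELL_RULE = re.compile(r'<td class="phone">\s*([^<]+?)\s*</td>')

# Hypothetical shallow rule: a cue word, a short gap, then a phone-like token.
CUE_RULE = re.compile(
    r"\b(?:call|phone|tel)\b[^\d<]{0,20}(\+?\d[\d\s-]{5,}\d)",
    re.IGNORECASE,
)


def extract_phones(text: str) -> List[str]:
    """Return phone numbers, preferring the layout rule over the lexical cue."""
    hits = CELL_RULE.findall(text)
    if hits:
        return hits
    # Less structured text: rely on the cue-word pattern instead of markup.
    return [match.strip() for match in CUE_RULE.findall(text)]


# Illustrative inputs only.
STRUCTURED = '<tr><td class="name">Ann</td><td class="phone">555-0100</td></tr>'
FREE_TEXT = "For bookings, please call us on 555 0199 before noon."

if __name__ == "__main__":
    print(extract_phones(STRUCTURED))
    print(extract_phones(FREE_TEXT))
```

A markup-only wrapper would return nothing for the second input, while the cue-word rule still finds the number, at the price of being less precise than a layout rule.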
