Knowledge Discovery - Overview

After the standardization of knowledge representation languages such as RDF and OWL, much research has been conducted in the area, especially on transforming relational databases into RDF, entity resolution, knowledge discovery, and ontology learning. The general process uses traditional methods from information extraction and ETL (extract, transform, load), which transform data from the sources into structured formats.

The following criteria can be used to categorize approaches in this area (some of them apply only to extraction from relational databases):

Source: Which data sources are covered (text, relational databases, XML, CSV)?
Exposition: How is the extracted knowledge made explicit (ontology file, semantic database), and how can it be queried?
Synchronization: Is the knowledge extraction process executed once to produce a dump (static), or is the result kept synchronized with the source (dynamic)? Are changes to the result written back to the source (bi-directional)?
Reuse of vocabularies: Can the tool reuse existing vocabularies in the extraction? For example, the table column 'firstName' can be mapped to foaf:firstName. Some automatic approaches are not capable of mapping to existing vocabularies.
Automation: The degree to which the extraction is assisted or automated (manual, GUI, semi-automatic, automatic).
Requires a domain ontology: Is a pre-existing ontology needed to map to? Either a mapping to such an ontology is created, or a schema is learned from the source (ontology learning).
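
The vocabulary-reuse criterion can be illustrated with a minimal sketch of a static, one-off extraction from a relational table into RDF (serialized as N-Triples). It is not the method of any particular tool: the base URI, the table layout, the COLUMN_TO_PROPERTY mapping, and the function names are all hypothetical and chosen only for illustration. Only the FOAF namespace and its firstName and lastName properties are taken from an existing vocabulary.

```python
import sqlite3

FOAF = "http://xmlns.com/foaf/0.1/"
BASE = "http://example.org/person/"  # hypothetical base URI for generated subjects

# Hypothetical manual mapping from table columns to existing vocabulary terms.
COLUMN_TO_PROPERTY = {
    "firstName": FOAF + "firstName",
    "lastName": FOAF + "lastName",
}


def escape_literal(value):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def rows_to_ntriples(conn, table, key_column):
    """Turn each row into one triple per mapped, non-NULL column."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    lines = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{record[key_column]}>"
        for column, prop in COLUMN_TO_PROPERTY.items():
            value = record.get(column)
            if value is not None:
                lines.append(f'{subject} <{prop}> "{escape_literal(value)}" .')
    return lines


def demo():
    """Build a tiny in-memory table and extract it once (a static dump)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT)"
    )
    conn.execute("INSERT INTO person VALUES (1, 'Ada', 'Lovelace')")
    conn.execute("INSERT INTO person VALUES (2, 'Alan', NULL)")
    return rows_to_ntriples(conn, "person", "id")


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In terms of the criteria above, this sketch covers a relational source, exposes the result as a file-style dump, is static rather than synchronized, reuses an existing vocabulary, and relies on a manually written mapping rather than a learned schema.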
