Record Linkage - Naming Conventions

Naming Conventions

"Record linkage" is the term used by statisticians, epidemiologists, and historians, among others, to describe the process of joining records from one data source with records from another that describe the same entity. Commercial mail and database applications refer to it as "merge/purge processing" or "list washing". Computer scientists often refer to it as "data matching" or as the "object identity problem". Other names for the same concept include "coreference/entity/identity/name/record resolution", "entity disambiguation/linking", "duplicate detection", "deduplication", "record matching", "(reference) reconciliation", "object identification", "data/information integration", and "conflation". This profusion of terminology has led to few cross-references between these research communities.

Although they share similar names, record linkage and Linked Data are two separate concepts. Record linkage focuses on the narrower task of identifying matching entities across different data sets, whereas Linked Data focuses on the broader methods of structuring and publishing data to facilitate the discovery of related information.
