Record Linkage - Mathematical Model

Mathematical Model

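In an application with two files, A and B, denote the rows (records) by α(a) in file A and β(b) in file B. Assign K characteristics to each record. The set of record pairs that represent identical entities is defined by

 M = \left\{ (a,b);\ a = b;\ a \in A;\ b \in B \right\}

and the complement of set M, namely the set U of pairs representing different entities, is defined as

 U = \left\{ (a,b);\ a \neq b;\ a \in A;\ b \in B \right\}.

A vector γ is defined that contains the coded agreements and disagreements on each characteristic:

 \gamma \left[ \alpha(a), \beta(b) \right] = \left\{ \gamma^{1} \left[ \alpha(a), \beta(b) \right], \ldots, \gamma^{K} \left[ \alpha(a), \beta(b) \right] \right\}

where the superscript k = 1, ..., K indexes the characteristics (sex, age, marital status, etc.) in the files. The conditional probabilities of observing a specific vector γ given (a, b) ∈ M and (a, b) ∈ U are defined as

 m(\gamma) = P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \mid (a,b) \in M \right\} = \sum_{(a, b) \in M} P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \right\} \cdot P \left[ (a,b) \mid M \right]

and

 u(\gamma) = P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \mid (a,b) \in U \right\} = \sum_{(a, b) \in U} P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \right\} \cdot P \left[ (a,b) \mid U \right],
respectively.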
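
As a rough illustration of these definitions, the Python sketch below builds the agreement vector for every record pair drawn from two small files and estimates m and u as relative frequencies among pairs that are known to match or not match. It rests on several assumptions that are not part of the model as stated above: the true entity of each record is known through a hypothetical `entity` key, agreement on a characteristic is coded as 1 and disagreement as 0, and every pair within the matched set (or the unmatched set) is equally likely. The function names and the toy records are invented for the example.

```python
from collections import Counter
from itertools import product

# Characteristics compared on each record pair (illustrative choice).
FIELDS = ("sex", "age", "marital_status")


def comparison_vector(rec_a, rec_b, fields=FIELDS):
    """Code agreement (1) or disagreement (0) on each characteristic."""
    return tuple(int(rec_a[f] == rec_b[f]) for f in fields)


def estimate_m_u(file_a, file_b, fields=FIELDS):
    """Estimate the two conditional distributions of the agreement vector.

    Assumes each record carries a hypothetical "entity" key with its true
    identity, and that all pairs within a set are equally likely, so each
    probability is a plain relative frequency.
    """
    matched, unmatched = Counter(), Counter()
    for rec_a, rec_b in product(file_a, file_b):
        gamma = comparison_vector(rec_a, rec_b, fields)
        if rec_a["entity"] == rec_b["entity"]:
            matched[gamma] += 1      # pair belongs to the matched set
        else:
            unmatched[gamma] += 1    # pair belongs to the unmatched set

    def normalise(counts):
        total = sum(counts.values())
        return {g: c / total for g, c in counts.items()} if total else {}

    return normalise(matched), normalise(unmatched)


# Toy data: two records in file A, three in file B.
file_a = [
    {"entity": 1, "sex": "F", "age": 34, "marital_status": "married"},
    {"entity": 2, "sex": "M", "age": 51, "marital_status": "single"},
]
file_b = [
    {"entity": 1, "sex": "F", "age": 34, "marital_status": "single"},
    {"entity": 2, "sex": "M", "age": 51, "marital_status": "single"},
    {"entity": 3, "sex": "F", "age": 51, "marital_status": "married"},
]

m, u = estimate_m_u(file_a, file_b)
print("m:", m)
print("u:", u)
```

On this toy data the two matched pairs give the vectors (1, 1, 0) and (1, 1, 1), each with estimated probability 0.5, while the four unmatched pairs give four distinct vectors with probability 0.25 each. In practice the true match status is usually unknown, so m and u have to be estimated by other means; the labelled-pair counting here is only meant to make the definitions concrete.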
