Record Linkage - Mathematical Model

In an application with two files, $A$ and $B$, denote the rows (records) by $\alpha(a)$ in file $A$ and $\beta(b)$ in file $B$. Assign $K$ characteristics to each record. The set of record pairs that represent identical entities is defined by

 M = \left\{ (a,b);\ a = b;\ a \in A;\ b \in B \right\}

and the complement of set $M$, namely the set $U$ of record pairs representing different entities, is defined as

 U = \left\{ (a,b);\ a \neq b;\ a \in A;\ b \in B \right\}.
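
The partition of all record pairs into the two sets can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the toy records, the field names, and in particular the `entity` key (a ground-truth identifier that real linkage problems usually lack) are all hypothetical and do not come from the text above.

```python
from itertools import product

# Hypothetical toy files; "entity" is an assumed ground-truth identifier
# that is normally unknown in a real linkage task.
file_a = [
    {"entity": 1, "sex": "F", "age": 34},
    {"entity": 2, "sex": "M", "age": 51},
]
file_b = [
    {"entity": 2, "sex": "M", "age": 51},
    {"entity": 3, "sex": "F", "age": 34},
    {"entity": 1, "sex": "F", "age": 35},
]


def partition_pairs(file_a, file_b):
    """Split the index pairs of A x B into the matched set and the unmatched set."""
    matched, unmatched = set(), set()
    for (i, rec_a), (j, rec_b) in product(enumerate(file_a), enumerate(file_b)):
        if rec_a["entity"] == rec_b["entity"]:
            matched.add((i, j))    # same entity: pair belongs to M
        else:
            unmatched.add((i, j))  # different entities: pair belongs to U
    return matched, unmatched


M, U = partition_pairs(file_a, file_b)
```

Every pair in the cross product lands in exactly one of the two sets, so `M` and `U` are disjoint and together cover all `len(file_a) * len(file_b)` pairs.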

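A vector $\gamma$ is defined that contains the coded agreements and disagreements on each characteristic:

 \gamma \left[ \alpha(a), \beta(b) \right] = \left\{ \gamma^{1} \left[ \alpha(a), \beta(b) \right], \ldots, \gamma^{K} \left[ \alpha(a), \beta(b) \right] \right\}

where the superscript $k = 1, \ldots, K$ indexes the characteristics (sex, age, marital status, etc.) in the files. The conditional probabilities of observing a specific vector $\gamma$ given $(a,b) \in M$ and given $(a,b) \in U$ are defined as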

 m(\gamma) = P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \mid (a,b) \in M \right\} = \sum_{(a, b) \in M} P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \right\} \cdot P \left[ (a,b) \mid M \right]

and

 u(\gamma) = P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \mid (a,b) \in U \right\} = \sum_{(a, b) \in U} P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \right\} \cdot P \left[ (a,b) \mid U \right],
respectively.
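
As a rough illustration of these definitions, the sketch below codes the comparison vector as a tuple of 0/1 agreement flags and estimates the two conditional distributions as relative frequencies over labelled pairs. Several things here are assumptions rather than part of the model as stated: exact equality as the agreement rule, the field names, the toy data, and the simplification that every pair within a set is equally likely (so the weight of each pair is one over the size of the set).

```python
from collections import Counter

# Hypothetical characteristics compared on each record pair (K = 3)
FIELDS = ("sex", "age", "marital_status")


def comparison_vector(rec_a, rec_b, fields=FIELDS):
    """Return gamma as a tuple: 1 where a field agrees exactly, 0 where it differs."""
    return tuple(int(rec_a[f] == rec_b[f]) for f in fields)


def empirical_distribution(pairs):
    """Relative frequency of each gamma pattern among the given record pairs."""
    counts = Counter(comparison_vector(a, b) for a, b in pairs)
    total = sum(counts.values())
    return {gamma: n / total for gamma, n in counts.items()}


# Toy labelled pairs; the match status is assumed to be known here
matched_pairs = [
    ({"sex": "F", "age": 34, "marital_status": "married"},
     {"sex": "F", "age": 34, "marital_status": "married"}),
    ({"sex": "M", "age": 51, "marital_status": "single"},
     {"sex": "M", "age": 52, "marital_status": "single"}),
]
unmatched_pairs = [
    ({"sex": "F", "age": 34, "marital_status": "married"},
     {"sex": "M", "age": 51, "marital_status": "single"}),
    ({"sex": "M", "age": 51, "marital_status": "single"},
     {"sex": "F", "age": 34, "marital_status": "single"}),
]

m = empirical_distribution(matched_pairs)    # estimate of m(gamma)
u = empirical_distribution(unmatched_pairs)  # estimate of u(gamma)
```

In this toy data, full agreement `(1, 1, 1)` appears only among the matched pairs and full disagreement `(0, 0, 0)` only among the unmatched pairs, which is the kind of contrast between the two distributions that the model relies on.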
