Record Linkage - Mathematical Model

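In an application with two files, A and B, denote the rows (records) by α(a) in file A and β(b) in file B. Assign K characteristics to each record. The set of record pairs that represent identical entities is defined by

 M = \left\{ (a,b) : a = b,\ a \in A,\ b \in B \right\}

and the complement of set M, namely the set U of pairs representing different entities, is defined as

 U = \left\{ (a,b) : a \neq b,\ a \in A,\ b \in B \right\}.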

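A vector γ is defined that contains the coded agreements and disagreements on each characteristic:

 \gamma \left[ \alpha(a), \beta(b) \right] = \left\{ \gamma^{1} \left[ \alpha(a), \beta(b) \right], \ldots, \gamma^{K} \left[ \alpha(a), \beta(b) \right] \right\}

where the index k = 1, ..., K ranges over the characteristics (sex, age, marital status, etc.) in the files. The conditional probabilities of observing a specific vector γ given (a, b) ∈ M and given (a, b) ∈ U are defined as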

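 m(\gamma) = P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \mid (a,b) \in M \right\} = \sum_{(a,b) \in M} P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \right\} \cdot P \left[ (a,b) \mid M \right]

and

 u(\gamma) = P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \mid (a,b) \in U \right\} = \sum_{(a,b) \in U} P \left\{ \gamma \left[ \alpha(a), \beta(b) \right] \right\} \cdot P \left[ (a,b) \mid U \right],

respectively.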
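
To make these definitions concrete, the following is a minimal sketch in Python rather than a definitive implementation. It rests on several assumptions that are not part of the model above: the two toy files and their values are invented, the "entity" field stands in for the true identity (which is normally unknown and is used here only to build M and U), agreement is coded as 1 and disagreement as 0, and every pair within M or U is weighted equally, so that m(γ) and u(γ) reduce to relative frequencies of each agreement pattern. The names gamma and pattern_frequencies are hypothetical helpers.

```python
from collections import Counter
from itertools import product

# Hypothetical toy files. "entity" is the true identity, assumed known here
# only so that the sets M and U can be constructed for illustration.
file_a = [
    {"entity": 1, "sex": "F", "age": 34, "marital": "married"},
    {"entity": 2, "sex": "M", "age": 51, "marital": "single"},
    {"entity": 3, "sex": "F", "age": 27, "marital": "single"},
]
file_b = [
    {"entity": 1, "sex": "F", "age": 34, "marital": "single"},
    {"entity": 2, "sex": "M", "age": 51, "marital": "single"},
    {"entity": 4, "sex": "F", "age": 27, "marital": "married"},
]
CHARACTERISTICS = ("sex", "age", "marital")  # K = 3


def gamma(rec_a, rec_b):
    """Comparison vector: 1 where a characteristic agrees, 0 where it does not."""
    return tuple(int(rec_a[k] == rec_b[k]) for k in CHARACTERISTICS)


def pattern_frequencies(pairs):
    """Relative frequency of each comparison vector among the given pairs."""
    counts = Counter(gamma(a, b) for a, b in pairs)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}


# All candidate pairs (a, b), split into matches M and non-matches U.
all_pairs = list(product(file_a, file_b))
M = [(a, b) for a, b in all_pairs if a["entity"] == b["entity"]]
U = [(a, b) for a, b in all_pairs if a["entity"] != b["entity"]]

m = pattern_frequencies(M)  # empirical m(gamma)
u = pattern_frequencies(U)  # empirical u(gamma)

for pattern in sorted(set(m) | set(u)):
    print(pattern, m.get(pattern, 0.0), u.get(pattern, 0.0))
```

In this toy data the pattern (1, 1, 1), agreement on all three characteristics, occurs only among matched pairs, which is the kind of contrast between m(γ) and u(γ) that the model is designed to capture. In practice the true match status is not available, so these probabilities have to be estimated by other means.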
