Record Linkage - Mathematical Model

Mathematical Model

In an application with two files, A and B, denote the rows (records) by in file A and in file B. Assign characteristics to each record. The set of records that represent identical entities is defined by

and the complement of set, namely set representing different entities is defined as

.

A vector, is defined, that contains the coded agreements and disagreements on each characteristic:

where is a subscript for the characteristics (sex, age, marital status, etc.) in the files. The conditional probabilities of observing a specific vector given, are defined as

 m(\gamma) = P \left\{ \gamma \left | (a,b) \in M \right\} = \sum_{(a, b) \in M} P \left\{\gamma\left \right\} \cdot P \left

and

 u(\gamma) = P \left\{ \gamma \left | (a,b) \in U \right\} = \sum_{(a, b) \in U} P \left\{\gamma\left \right\} \cdot P \left,
respectively.

Read more about this topic:  Record Linkage

Famous quotes containing the words mathematical and/or model:

    It is by a mathematical point only that we are wise, as the sailor or the fugitive slave keeps the polestar in his eye; but that is sufficient guidance for all our life. We may not arrive at our port within a calculable period, but we would preserve the true course.
    Henry David Thoreau (1817–1862)

    Your home is regarded as a model home, your life as a model life. But all this splendor, and you along with it ... it’s just as though it were built upon a shifting quagmire. A moment may come, a word can be spoken, and both you and all this splendor will collapse.
    Henrik Ibsen (1828–1906)