Adjusted Mutual Information - Adjustment For Chance

As with the Rand index, the baseline value of the mutual information between two random clusterings U and V is not constant: it tends to be larger when the two partitions have more clusters (for a fixed number of set elements N). By adopting a hypergeometric model of randomness, it can be shown that the expected mutual information between two random clusterings, with R and C clusters respectively, is:

\begin{align} E\{MI(U,V)\} = &
\sum_{i=1}^R \sum_{j=1}^C
\sum_{n_{ij}=(a_i+b_j-N)^+}^{\min(a_i, b_j)}
\frac{n_{ij}}{N}
\log \left( \frac{ N\cdot n_{ij}}{a_i b_j}\right) \times \\
& \frac{a_i!b_j!(N-a_i)!(N-b_j)!}
{N!n_{ij}!(a_i-n_{ij})!(b_j-n_{ij})!(N-a_i-b_j+n_{ij})!} \\
\end{align}

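where $(a_i+b_j-N)^+$ denotes $\max(1, a_i+b_j-N)$. The variables $a_i$ and $b_j$ are partial sums of the contingency table; that is,

 a_i = \sum_{j=1}^C n_{ij}

and

 b_j = \sum_{i=1}^R n_{ij}.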
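
The expected-value formula above can be evaluated directly from the row and column sums of the contingency table. The following is a minimal sketch, not a reference implementation: the function names are hypothetical, natural logarithms are assumed (so the result is in nats), and the factorials are evaluated through the log-gamma function to avoid overflow for large N.

```python
from math import exp, lgamma, log


def log_factorial(k):
    """Return ln(k!) via the log-gamma function, avoiding huge factorials."""
    return lgamma(k + 1)


def expected_mutual_information(a, b):
    """Expected MI (in nats) given row sums a and column sums b.

    Hypothetical helper: the name and signature are illustrative only.
    """
    n = sum(a)
    if n != sum(b):
        raise ValueError("row sums and column sums must both total N")
    emi = 0.0
    for a_i in a:
        for b_j in b:
            # Lower limit (a_i + b_j - N)^+ = max(1, a_i + b_j - N);
            # an n_ij = 0 term would contribute nothing to the sum.
            lower = max(1, a_i + b_j - n)
            upper = min(a_i, b_j)
            for n_ij in range(lower, upper + 1):
                # Log of the hypergeometric probability of observing n_ij.
                log_prob = (
                    log_factorial(a_i) + log_factorial(b_j)
                    + log_factorial(n - a_i) + log_factorial(n - b_j)
                    - log_factorial(n) - log_factorial(n_ij)
                    - log_factorial(a_i - n_ij) - log_factorial(b_j - n_ij)
                    - log_factorial(n - a_i - b_j + n_ij)
                )
                emi += (n_ij / n) * log(n * n_ij / (a_i * b_j)) * exp(log_prob)
    return emi


# Two balanced clusters in each partition of N = 4 elements.
print(expected_mutual_information([2, 2], [2, 2]))  # about 0.2310, i.e. ln(2)/3
```

In this small case only the cells with n_ij = 2 contribute: each has probability 1/6 and adds (1/2) ln 2 / 6, and the four cells together give ln(2)/3.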

The adjusted measure for the mutual information may then be defined to be:

 AMI(U,V) = \frac{MI(U,V) - E\{MI(U,V)\}}{\max\{H(U), H(V)\} - E\{MI(U,V)\}}.

The AMI takes a value of 1 when the two partitions are identical and 0 when the MI between the two partitions equals the value expected by chance.
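
As an illustration of how the adjusted measure behaves, here is a minimal self-contained sketch that computes the AMI from two label sequences. All function names are hypothetical, natural logarithms are assumed throughout, H is taken to be the entropy of the cluster-size distribution, and returning 1.0 when the denominator vanishes (for example, when both partitions consist of a single cluster) is a convention chosen here rather than something the definition fixes.

```python
from collections import Counter
from math import exp, lgamma, log


def entropy(counts, n):
    """Entropy (in nats) of a partition with the given cluster sizes."""
    return -sum(c / n * log(c / n) for c in counts if c > 0)


def expected_mi(a, b, n):
    """Expected MI under the hypergeometric model (hypothetical helper)."""
    lf = lambda k: lgamma(k + 1)  # ln(k!)
    total = 0.0
    for a_i in a:
        for b_j in b:
            for n_ij in range(max(1, a_i + b_j - n), min(a_i, b_j) + 1):
                log_prob = (lf(a_i) + lf(b_j) + lf(n - a_i) + lf(n - b_j)
                            - lf(n) - lf(n_ij) - lf(a_i - n_ij)
                            - lf(b_j - n_ij) - lf(n - a_i - b_j + n_ij))
                total += n_ij / n * log(n * n_ij / (a_i * b_j)) * exp(log_prob)
    return total


def adjusted_mutual_information(u, v):
    """AMI of two labelings u and v, normalised by max(H(U), H(V))."""
    if len(u) != len(v) or not u:
        raise ValueError("labelings must be non-empty and of equal length")
    n = len(u)
    table = Counter(zip(u, v))  # contingency counts n_ij
    a = Counter(u)              # row sums a_i
    b = Counter(v)              # column sums b_j
    mi = sum(n_ij / n * log(n * n_ij / (a[i] * b[j]))
             for (i, j), n_ij in table.items())
    emi = expected_mi(list(a.values()), list(b.values()), n)
    denom = max(entropy(a.values(), n), entropy(b.values(), n)) - emi
    if abs(denom) < 1e-15:
        return 1.0  # assumed convention for the degenerate case
    return (mi - emi) / denom


print(adjusted_mutual_information([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0 up to rounding
print(adjusted_mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # -0.5 up to rounding
```

The first call compares a partition with a relabelled copy of itself, so the AMI is 1. In the second call the MI is 0, which lies below the chance baseline of ln(2)/3, so the adjusted value is negative.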
