Mutual Information of Two Partitions
Given a set S of N elements, consider two partitions of S, namely with R clusters, and with C clusters. It is presumed here that the partitions are so-called hard clusters, the partitions are pairwise disjoint:
for all, and complete:
The mutual information of cluster overlap between U and V can be summarized in the form of an RxC contingency table, where denotes the number of objects that are common to clusters and . That is,
Suppose an object is picked at random from S; the probability that the object falls into cluster is:
The entropy associated with the partitioning U is:
H(U) is non-negative and takes the value 0 only when there is no uncertainty determining an object's cluster membership, i.e., when there is only one cluster. Similarly, the entropy of the clustering V can be calculated as:
where . The mutual information (MI) between two partitions:
where P(i,j) denotes the probability that a point belongs to both the cluster in U and cluster in V:
MI is a non-negative quantity upper bounded by the entropies H(U) and H(V). It quantifies the information shared by the two clusterings and thus can be employed as a clustering similarity measure.
Read more about this topic: Adjusted Mutual Information
Famous quotes containing the words mutual, information and/or partitions:
“Louise Bryant: Im sorry if you dont believe in mutual independence and free love and respect.
Eugene ONeill: Dont give me a lot of parlor socialism that you learned in the village. If you were mine, I wouldnt share you with anybody or anything. It would be just you and me. Youd be at the center of it all. You know it would feel a lot more like love than being left alone with your work.”
—Warren Beatty (b. 1937)
“When action grows unprofitable, gather information; when information grows unprofitable, sleep.”
—Ursula K. Le Guin (b. 1929)
“Great wits are sure to madness near allied,
And thin partitions do their bounds divide.”
—John Dryden (16311700)