Information Content of A PWM
The information content (IC) of a PWM is sometimes of interest, as it says something about how different a given PWM is from a uniform distribution.
The self-information of observing a particular symbol at a particular position of the motif is:
The expected (average) self-information of a particular element in the PWM is then:
Finally, the IC of the PWM is then the sum of the expected self-information of every element:
Often, it is more useful to calculate the information content with the background letter frequencies of the sequences you are studying rather than assuming equal probabilities of each letter (e.g., the GC-content of DNA of thermophilic bacteria range from 65.3 to 70.8, thus a motif of ATAT would contain much more information than a motif of CCGG). The equation for information content thus becomes
where is the background frequency for that letter. This corresponds to the Kullback-Leibler divergence or relative entropy. However, it has been shown that when using PSSM to search genomic sequences (see below) this uniform correction can lead to overestimation of the importance of the different bases in a motif, due to the uneven distribution of n-mers in real genomes, leading to a significantly larger number of false positives.
Read more about this topic: Position-specific Scoring Matrix
Famous quotes containing the words information and/or content:
“So while it is true that children are exposed to more information and a greater variety of experiences than were children of the past, it does not follow that they automatically become more sophisticated. We always know much more than we understand, and with the torrent of information to which young people are exposed, the gap between knowing and understanding, between experience and learning, has become even greater than it was in the past.”
—David Elkind (20th century)
“Women are angels, wooing;
Things won are done, joys soul lies in the doing.
That she beloved knows naught that knows not this:
Men prize the thing ungained more than it is.
That she was never yet that ever knew
Love got so sweet as when desire did sue.
Therefore this maxim out of love I teach:
Achievement is command; ungained, beseech.
Then though my hearts content firm love doth bear,
Nothing of that shall from mine eyes appear.”
—William Shakespeare (15641616)