Position-specific Scoring Matrix - Information Content of A PWM

Information Content of A PWM

The information content (IC) of a PWM is sometimes of interest, as it says something about how different a given PWM is from a uniform distribution.

The self-information of observing a particular symbol at a particular position of the motif is:

The expected (average) self-information of a particular element in the PWM is then:

Finally, the IC of the PWM is then the sum of the expected self-information of every element:

Often, it is more useful to calculate the information content with the background letter frequencies of the sequences you are studying rather than assuming equal probabilities of each letter (e.g., the GC-content of DNA of thermophilic bacteria range from 65.3 to 70.8, thus a motif of ATAT would contain much more information than a motif of CCGG). The equation for information content thus becomes

where is the background frequency for that letter. This corresponds to the Kullback-Leibler divergence or relative entropy. However, it has been shown that when using PSSM to search genomic sequences (see below) this uniform correction can lead to overestimation of the importance of the different bases in a motif, due to the uneven distribution of n-mers in real genomes, leading to a significantly larger number of false positives.

Read more about this topic:  Position-specific Scoring Matrix

Famous quotes containing the words information and/or content:

    In the information age, you don’t teach philosophy as they did after feudalism. You perform it. If Aristotle were alive today he’d have a talk show.
    Timothy Leary (b. 1920)

    In America the taint of sectarianism lies broad upon the land. Not content with acknowledging the supremacy as the Diety, and with erecting temples in his honor, where all can bow down with reverence, the pride and vanity of human reason enter into and pollute our worship, and the houses that should be of God and for God, alone, where he is to be honored with submissive faith, are too often merely schools of metaphysical and useless distinctions. The nation is sectarian, rather than Christian.
    James Fenimore Cooper (1789–1851)