Multiple EM For Motif Elicitation - Example

Example

In the following example, one starts from an ungapped alignment of 3 different sequences, each 9 nucleotides long, and builds a weight matrix from it.

1: C G G G T A A G T
2: A A G G T A T G C
3: C A G G T G A G G

Now one counts how often each nucleotide occurs at each of the 9 positions across all sequences; the last number in each row is that nucleotide's total count:

A: 1 2 0 0 0 2 2 0 0 7
C: 2 0 0 0 0 0 0 0 1 3
G: 0 1 3 3 0 1 0 3 1 12
T: 0 0 0 0 3 0 1 0 1 5

Now one sums the totals: 7+3+12+5 = 27. Dividing each nucleotide's total by 27 gives the "dividing factor" for that base, that is, the overall probability of each nucleotide:
A: 7/27 = 0.26
C: 3/27 = 0.11
G: 12/27 = 0.44
T: 5/27 = 0.19
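
The counting and the overall probabilities above can be reproduced with a few lines of Python. This is only a minimal sketch: the names `position_counts` and `overall_frequencies` are illustrative and do not come from the MEME software.

```python
# The three aligned sequences from the example.
sequences = ["CGGGTAAGT", "AAGGTATGC", "CAGGTGAGG"]
BASES = "ACGT"

def position_counts(seqs):
    """Count each base at each alignment position."""
    length = len(seqs[0])
    return {b: [sum(s[i] == b for s in seqs) for i in range(length)]
            for b in BASES}

def overall_frequencies(counts):
    """Divide each base's total count by the grand total of all counts."""
    totals = {b: sum(row) for b, row in counts.items()}
    grand_total = sum(totals.values())
    return totals, {b: totals[b] / grand_total for b in BASES}

counts = position_counts(sequences)
totals, freqs = overall_frequencies(counts)

for b in BASES:
    print(b, counts[b], totals[b], round(freqs[b], 2))
```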

Now one can convert the counts into a weight matrix (WM) of per-position frequencies by dividing each count by the total number of sequences (in our case 3); values are truncated to two decimals:
A: 0.33 0.66 0.00 0.00 0.00 0.66 0.66 0.00 0.00
C: 0.66 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.33
G: 0.00 0.33 1.00 1.00 0.00 0.33 0.00 1.00 0.33
T: 0.00 0.00 0.00 0.00 1.00 0.00 0.33 0.00 0.33
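
As a sketch of this normalisation step, the count table can be divided by the number of sequences as follows. The helper name `to_frequency_matrix` is hypothetical. Note that Python rounds 2/3 to 0.67, whereas the table above shows it as 0.66.

```python
# Per-position counts taken from the table in the text.
counts = {
    "A": [1, 2, 0, 0, 0, 2, 2, 0, 0],
    "C": [2, 0, 0, 0, 0, 0, 0, 0, 1],
    "G": [0, 1, 3, 3, 0, 1, 0, 3, 1],
    "T": [0, 0, 0, 0, 3, 0, 1, 0, 1],
}
num_sequences = 3

def to_frequency_matrix(counts, n):
    """Divide every count by the number of aligned sequences."""
    return {base: [c / n for c in row] for base, row in counts.items()}

freq_matrix = to_frequency_matrix(counts, num_sequences)

# Sanity check: at every position the four frequencies sum to 1.
column_sums = [sum(freq_matrix[b][i] for b in "ACGT") for i in range(9)]

for base, row in freq_matrix.items():
    print(base, [round(x, 2) for x in row])
```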

Next, one divides each entry of the WM at position i by the overall probability of the corresponding base (using the rounded values above):
A: 1.27 2.54 0.00 0.00 0.00 2.54 2.54 0.00 0.00
C: 6.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00
G: 0.00 0.75 2.27 2.27 0.00 0.75 0.00 2.27 0.75
T: 0.00 0.00 0.00 0.00 5.26 0.00 1.74 0.00 1.74

To score a sequence, one would in general multiply the entries that correspond to its bases. Here a single zero entry would make the whole product zero, so one instead takes the base-10 logarithm of each entry, which turns the product into a sum, and defines log(0) = -10:

A: 0.10 0.40 -10 -10 -10 0.40 0.40 -10 -10
C: 0.78 -10 -10 -10 -10 -10 -10 -10 0.48
G: -10 -0.1 0.36 0.36 -10 -0.1 -10 0.36 -0.1
T: -10 -10 -10 -10 0.72 -10 0.24 -10 0.24

This is the new, logarithmic weight matrix (WM). One can now score a candidate promoter sequence by adding the entries of the logarithmic WM that match its base at each position. For instance, for the promoter AGGCTGATC:
0.10 - 0.1 + 0.36 - 10 + 0.72 - 0.1 + 0.40 - 10 + 0.48 = -18.14
This is then divided by the number of positions (in our case 9), yielding a score of -2.02.
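
The remaining steps (dividing by the overall base probability, taking the logarithm, and scoring a candidate) can be sketched in Python as below. The function names and the `LOG_ZERO` constant are illustrative assumptions, not part of MEME itself. The sketch works with unrounded intermediate values, so its total may differ from the hand calculation in the second decimal place.

```python
import math

sequences = ["CGGGTAAGT", "AAGGTATGC", "CAGGTGAGG"]
BASES = "ACGT"
LOG_ZERO = -10.0  # stand-in for log(0), as defined in the text

def build_log_matrix(seqs):
    """Return log10(position frequency / overall frequency) per base and position."""
    n = len(seqs)
    length = len(seqs[0])
    overall = {b: sum(s.count(b) for s in seqs) / (n * length) for b in BASES}
    matrix = {}
    for b in BASES:
        row = []
        for i in range(length):
            freq = sum(s[i] == b for s in seqs) / n
            if freq == 0:
                row.append(LOG_ZERO)  # base never seen at this position
            else:
                row.append(math.log10(freq / overall[b]))
        matrix[b] = row
    return matrix

def score(matrix, candidate):
    """Sum the matching entries, then average over the number of positions."""
    total = sum(matrix[base][i] for i, base in enumerate(candidate))
    return total, total / len(candidate)

log_matrix = build_log_matrix(sequences)
total, average = score(log_matrix, "AGGCTGATC")
print(round(total, 2), round(average, 2))
```

A sequence that matches the alignment well, such as one of the three input sequences, contains no -10 entries and therefore receives a positive score.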
