Multiple Sequence Alignment - Motif Finding

Motif Finding

Motif finding, also known as profile analysis, locates sequence motifs in global MSAs. It serves two purposes: producing a better MSA, and producing a scoring matrix that can be used to search other sequences for similar motifs. A variety of methods for isolating motifs have been developed, but all are based on identifying short, highly conserved patterns within the larger alignment and constructing a matrix, similar to a substitution matrix, that reflects the amino acid or nucleotide composition of each position in the putative motif. The alignment can then be refined using these matrices. In standard profile analysis, the matrix includes entries for each possible character as well as entries for gaps. Alternatively, statistical pattern-finding algorithms can identify motifs as a precursor to an MSA rather than as a derivation from one. When the query set contains only a small number of sequences, or only highly related sequences, pseudocounts are often added to normalize the distribution reflected in the scoring matrix. In particular, this replaces zero-probability entries in the matrix with values that are small but nonzero.
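The matrix construction and pseudocount correction described above can be illustrated with a minimal sketch. It assumes a DNA alphabet, an ungapped block of equal-length sequences, a uniform background distribution, and a pseudocount of 1 per character. The function names (`build_profile`, `log_odds_score`, `best_match`) are hypothetical and not taken from any particular tool, and unlike a standard profile this simplified matrix has no gap entries.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: DNA; a protein profile would list the 20 amino acids


def build_profile(block, pseudocount=1.0, alphabet=ALPHABET):
    """Return one {character: probability} dict per column of an ungapped block."""
    length = len(block[0])
    if any(len(seq) != length for seq in block):
        raise ValueError("all sequences in the block must have equal length")
    total = len(block) + pseudocount * len(alphabet)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in block)
        # The pseudocount keeps characters never seen in this column above zero.
        profile.append({ch: (counts[ch] + pseudocount) / total for ch in alphabet})
    return profile


def log_odds_score(profile, window):
    """Sum of log2(profile probability / uniform background) over the window."""
    background = 1.0 / len(profile[0])
    return sum(math.log2(col[ch] / background) for col, ch in zip(profile, window))


def best_match(profile, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(profile)
    return max(
        (log_odds_score(profile, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Illustrative data only: four aligned instances of a short motif.
block = ["TATAAT", "TATGAT", "TACAAT", "TATATT"]
profile = build_profile(block)
score, start = best_match(profile, "GGCTATAATGC")
print(start, round(score, 2))
```

In the first column every sequence has `T`, so without pseudocounts the other three characters would have probability zero and any window starting with them would score negative infinity. With the pseudocount they each receive 1/8 instead.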

Blocks analysis is a method of motif finding that restricts motifs to ungapped regions in the alignment. Blocks can be generated from an MSA or they can be extracted from unaligned sequences using a precalculated set of common motifs previously generated from known gene families. Block scoring generally relies on the spacing of high-frequency characters rather than on the calculation of an explicit substitution matrix. The BLOCKS server provides an interactive method to locate such motifs in unaligned sequences.
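As a rough illustration of restricting candidate motifs to ungapped regions of an existing MSA, the sketch below reports maximal runs of columns in which no sequence has a gap. It is a simplification: it applies only the gap-free criterion and does not score conservation or character spacing, and it is not the procedure used by the BLOCKS server. The function name `find_blocks`, the `min_width` parameter, and the sample alignment are assumptions made for the example.

```python
def find_blocks(msa, min_width=3, gap="-"):
    """Return (start, end) column ranges, end exclusive, with no gap in any row."""
    n_cols = len(msa[0])
    if any(len(row) != n_cols for row in msa):
        raise ValueError("all rows of the alignment must have equal length")
    blocks = []
    start = None
    # Iterate one column past the end so a run touching the last column is closed.
    for col in range(n_cols + 1):
        gap_free = col < n_cols and all(row[col] != gap for row in msa)
        if gap_free and start is None:
            start = col
        elif not gap_free and start is not None:
            if col - start >= min_width:
                blocks.append((start, col))
            start = None
    return blocks


# Illustrative alignment only.
msa = [
    "MKV-LLTAGG-",
    "MKVALLT-GGA",
    "MRV-LLTAGGS",
]
for start, end in find_blocks(msa):
    print(start, end, [row[start:end] for row in msa])
```

The gap-free run covering columns 8 and 9 is discarded here because it is shorter than `min_width`.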

Statistical pattern-matching has been implemented using both the expectation-maximization algorithm and the Gibbs sampler. One of the most common motif-finding tools, known as MEME, uses expectation maximization and hidden Markov methods to generate motifs that are then used as search tools by its companion MAST in the combined suite MEME/MAST.
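To make the Gibbs sampling approach concrete, here is a minimal sketch of a site sampler that looks for one motif occurrence of fixed width in each unaligned sequence. It is an illustration under stated assumptions (DNA alphabet, exactly one occurrence per sequence, at least two sequences, no background model), not the algorithm implemented by MEME, which uses expectation maximization. The function name `gibbs_motif_search` and all parameters are hypothetical.

```python
import random

ALPHABET = "ACGT"  # assumption: DNA sequences containing only these characters


def gibbs_motif_search(seqs, width, iterations=200, pseudocount=1.0, seed=0):
    """Sample one motif start per sequence; returns the final list of starts."""
    rng = random.Random(seed)
    # Start from a random motif position in every sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, seq in enumerate(seqs):
            # Build a profile from the current sites in all other sequences.
            others = [
                s[p:p + width]
                for j, (s, p) in enumerate(zip(seqs, starts))
                if j != i
            ]
            total = len(others) + pseudocount * len(ALPHABET)
            profile = [
                {
                    ch: (sum(site[c] == ch for site in others) + pseudocount) / total
                    for ch in ALPHABET
                }
                for c in range(width)
            ]
            # Weight each candidate start by its window's probability under the profile.
            weights = []
            for p in range(len(seq) - width + 1):
                w = 1.0
                for c in range(width):
                    w *= profile[c][seq[p + c]]
                weights.append(w)
            # Resample the site for sequence i in proportion to those weights.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts


# Illustrative data only: each sequence contains a copy of TATAAT.
seqs = ["GCGTATAATCCG", "TATAATGGCGCC", "CCGGCGTATAAT", "GGTATAATCGCG"]
starts = gibbs_motif_search(seqs, width=6)
print([s[p:p + 6] for s, p in zip(seqs, starts)])
```

Because the sampler is stochastic, the sites it reports depend on the seed and the number of iterations, and it can settle on a shifted or spurious pattern. Practical implementations usually run several restarts and keep the highest-scoring result.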
