Sequence Assembly - EST Assemblers

EST Assemblers

Expressed sequence tag (EST) assembly differs from genome assembly in several ways. The sequences for EST assembly are derived from the transcribed mRNA of a cell and represent only a subset of the whole genome. At first glance, the underlying algorithmic problems differ between genome and EST assembly. For instance, genomes often contain large amounts of repetitive sequence, mainly in the intergenic regions. Since ESTs represent gene transcripts, they will not contain these repeats. On the other hand, cells tend to have a certain number of genes that are constantly expressed at very high levels (housekeeping genes), which again leads to the problem of similar sequences being present in large numbers in the data set to be assembled.
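The abundance skew caused by highly expressed genes can be illustrated with a simple k-mer count. The following is a minimal sketch, not the method of any particular EST assembler: the function names, the toy reads, and the choice of k = 4 are all assumptions made for illustration.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-length substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def median_kmer_abundance(read, counts, k):
    """Median count of the read's k-mers, a rough proxy for how often
    the read's source transcript appears in the data set."""
    values = sorted(counts[read[i:i + k]] for i in range(len(read) - k + 1))
    return values[len(values) // 2] if values else 0

# Hypothetical toy data: five copies of a "housekeeping" read, one rare read.
reads = ["ACGTACGGTCA"] * 5 + ["TTGACCATGGA"]
counts = kmer_counts(reads, k=4)
abundances = [median_kmer_abundance(r, counts, 4) for r in reads]
print(abundances)
```

Reads whose median k-mer abundance is far above the rest come from over-represented transcripts; an assembler has to cope with many near-identical copies of them, much as a genome assembler has to cope with repeats.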

Furthermore, genes sometimes overlap in the genome (sense-antisense transcription), and should ideally still be assembled separately. EST assembly is also complicated by features such as (cis-) alternative splicing, trans-splicing, single-nucleotide polymorphisms, recoding, and post-transcriptional modification.
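The sense-antisense problem can be made concrete with a small sketch. It assumes, purely for illustration, that overlaps are detected by shared k-mers and that the strand of each read is known; the sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the sequence of the opposite DNA strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def shares_kmer(a, b, k):
    """True if the two sequences have at least one k-mer in common."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers_a for i in range(len(b) - k + 1))

# Hypothetical transcripts from the same locus but opposite strands.
sense = "ATGGCCTTAGCA"
antisense = reverse_complement(sense[4:])

# Strand-aware comparison: the reads are compared only as given.
same_strand = shares_kmer(sense, antisense, 6)

# Strand-unaware comparison: the reverse complement is also tried.
either_strand = same_strand or shares_kmer(
    sense, reverse_complement(antisense), 6
)
print(same_strand, either_strand)
```

A comparison that also tries the reverse complement reports an overlap between the two transcripts and would tend to merge them, whereas a strand-aware comparison keeps them apart.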
