Gene Prediction - Extrinsic Approaches

Extrinsic Approaches

In extrinsic (or evidence-based) gene finding systems, the target genome is searched for sequences that are similar to extrinsic evidence in the form of the known sequence of a messenger RNA (mRNA) or protein product. Given an mRNA sequence, it is trivial to derive a unique genomic DNA sequence from which it had to have been transcribed. Given a protein sequence, a family of possible coding DNA sequences can be derived by reverse translation of the genetic code. Once candidate DNA sequences have been determined, it is a relatively straightforward algorithmic problem to efficiently search a target genome for matches, complete or partial, and exact or inexact. BLAST is a widely used system designed for this purpose.

A high degree of similarity to a known messenger RNA or protein product is strong evidence that a region of a target genome is a protein-coding gene. However, to apply this approach systemically requires extensive sequencing of mRNA and protein products. Not only is this expensive, but in complex organisms, only a subset of all genes in the organism's genome are expressed at any given time, meaning that extrinsic evidence for many genes is not readily accessible in any single cell culture. Thus, in order to collect extrinsic evidence for most or all of the genes in a complex organism, many hundreds or thousands of different cell types must be studied, which itself presents further difficulties. For example, some human genes may be expressed only during development as an embryo or fetus, which might be difficult to study for ethical reasons.

Despite these difficulties, extensive transcript and protein sequence databases have been generated for human as well as other important model organisms in biology, such as mice and yeast. For example, the RefSeq database contains transcript and protein sequence from many different species, and the Ensembl system comprehensively maps this evidence to human and several other genomes. It is, however, likely that these databases are both incomplete and contain small but significant amounts of erroneous data..

Read more about this topic:  Gene Prediction

Famous quotes containing the words extrinsic and/or approaches:

    Authors communicate with the people by some special extrinsic mark; I am the first to do so by my entire being, as Michel de Montaigne.
    Michel de Montaigne (1533–1592)

    No one ever approaches perfection except by stealth, and unknown to themselves.
    William Hazlitt (1778–1830)