Genome Sequencing and Statistics
The T. vaginalis genome was found to be approximately 160 megabases in size, ten times larger than predicted from earlier gel-based chromosome sizing. (The human genome is ~3.5 gigabases by comparison.) As much as two-thirds of the T. vaginalis sequence consists of repetitive and transposable elements, reflecting a massive, evolutionarily recent expansion of the genome. The total number of predicted protein-coding genes is ~98,000, which includes ~38,000 'repeat' genes (virus-like, transposon-like, retrotransposon-like, and unclassified repeats, all with high copy number and low polymorphism). Approximately 26,000 of the protein-coding genes have been classed as 'evidence-supported' (similar either to known proteins or to ESTs), while the remainder have no known function. These extraordinary genome statistics are likely to be revised downward as the genome sequence, currently very fragmented because repetitive DNA is difficult to order, is assembled into chromosomes, and as more transcription data (ESTs, microarrays) accumulate. Even so, it appears that the gene number of the single-celled parasite T. vaginalis is, at minimum, on par with that of its host H. sapiens.
In late 2007, TrichDB.org was launched as a free, public genomic data repository and retrieval service devoted to genome-scale trichomonad data. The site currently contains all of the T. vaginalis sequence project data, several EST libraries, and tools for data mining and display. TrichDB is part of the NIH/NIAID-funded EuPathDB functional genomics database project.