Transcription (genetics)

Transcription is the first step of gene expression, in which a particular segment of DNA is copied into RNA by the enzyme RNA polymerase. Both RNA and DNA are nucleic acids, which use base pairs of nucleotides as a complementary language that the appropriate enzymes can convert back and forth between DNA and RNA. During transcription, a DNA sequence is read by an RNA polymerase, which produces a complementary, antiparallel RNA strand. Unlike DNA replication, transcription results in an RNA complement that includes uracil (U) in all instances where thymine (T) would have occurred in a DNA complement. Also unlike DNA replication, transcription does not require a primer to initiate synthesis of the new strand.

Transcription can be summarized in four steps, plus a fifth in cells that have a nucleus, each moving like a wave along the DNA.

  1. RNA polymerase, assisted by accessory factors, opens and moves the transcription bubble, like the slider of a zipper, splitting the double-helix DNA molecule into two strands of unpaired DNA nucleotides by breaking the hydrogen bonds between complementary DNA nucleotides.
  2. RNA polymerase adds matching RNA nucleotides that are paired with complementary DNA nucleotides of one DNA strand.
  3. The RNA sugar-phosphate backbone forms with assistance from RNA polymerase, joining the nucleotides into an RNA strand.
  4. Hydrogen bonds of the RNA-DNA hybrid helix break, freeing the newly synthesized RNA strand.
  5. If the cell has a nucleus, the RNA is further processed (addition of a 3' poly-A tail and a 5' cap) and exits to the cytoplasm through the nuclear pore complex.
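
The pairing in steps 2 and 3 can be illustrated with a minimal Python sketch. It is a toy model rather than a description of the enzyme: the function name transcribe and the example sequence are hypothetical, and the template strand is assumed to be written in the order it is read (3' to 5'), so the RNA comes out written 5' to 3'.

```python
# Pairing rules: each DNA template base and the RNA nucleotide matched to it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') for a template strand written 3'->5'."""
    rna = []
    for base in template_3_to_5.upper():
        # Each matched nucleotide is joined to the growing 3' end of the RNA.
        rna.append(DNA_TO_RNA[base])
    return "".join(rna)


# Hypothetical template strand, written 3'->5'.
template = "TACGGATTC"
print(transcribe(template))  # AUGCCUAAG
```

Note that adenine in the template is paired with uracil, not thymine, in the RNA.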

Transcription is the first step leading to gene expression. The stretch of DNA transcribed into an RNA molecule is called a transcription unit and encodes at least one gene. If the gene transcribed encodes a protein, the result of transcription is messenger RNA (mRNA), which will then be used to create that protein via the process of translation. Alternatively, the transcribed gene may encode a non-coding RNA (such as microRNA or lincRNA), a ribosomal RNA (rRNA) or transfer RNA (tRNA), which are components of the protein-assembly process, or another ribozyme.

A DNA transcription unit encoding a protein contains not only the sequence that will eventually be directly translated into the protein (the coding sequence) but also regulatory sequences that direct and regulate the synthesis of that protein. The regulatory sequence before (upstream from) the coding sequence is called the five prime untranslated region (5'UTR), and the sequence following (downstream from) the coding sequence is called the three prime untranslated region (3'UTR).
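
As a rough illustration of this layout, the sketch below splits an mRNA string into its 5'UTR, coding sequence and 3'UTR. It rests on simplifying assumptions that are not part of the text above: the coding sequence is taken to start at the first AUG and to end at the first in-frame stop codon. The function name split_mrna and the example sequence are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna: str) -> tuple[str, str, str]:
    """Split an mRNA (written 5'->3') into (5'UTR, coding sequence, 3'UTR).

    Assumption: the coding sequence runs from the first AUG through the
    first in-frame stop codon. Real transcripts can be more complicated.
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    # Walk codon by codon in the reading frame set by the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")


# Hypothetical transcript: GGC | AUG CCU AAG UAA | CCA
five_utr, cds, three_utr = split_mrna("GGCAUGCCUAAGUAACCA")
print(five_utr, cds, three_utr)  # GGC AUGCCUAAGUAA CCA
```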

Transcription has some proofreading mechanisms, but they are fewer and less effective than the controls for copying DNA; therefore, transcription has a lower copying fidelity than DNA replication.

As in DNA replication, the template DNA is read in the 3' → 5' direction during transcription, while the complementary RNA is built in the 5' → 3' direction, so its 5' end is made first. Although DNA is arranged as two antiparallel strands in a double helix, only one of the two DNA strands, called the template strand, is used for transcription. This is because RNA is single-stranded, as opposed to double-stranded DNA. The other DNA strand is called the coding strand, because its sequence is the same as the newly created RNA transcript (except for the substitution of uracil for thymine). Because only the strand read 3' → 5' is used, transcription has no need for the Okazaki fragments seen in DNA replication.
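
The relationship between the two strands and the transcript can be checked with a short Python sketch. Here both DNA strands are written 5' to 3', as is conventional, so reading the template 3' to 5' means walking its string from the end. The function names and the example sequence are hypothetical, chosen only for illustration.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna_5_to_3: str) -> str:
    """Return the opposite, antiparallel DNA strand, also written 5'->3'."""
    return "".join(DNA_COMPLEMENT[b] for b in reversed(dna_5_to_3))


def transcribe_template(template_5_to_3: str) -> str:
    """Read the template 3'->5' (end to start) and build the RNA 5'->3'."""
    return "".join(RNA_PAIRING[b] for b in reversed(template_5_to_3))


coding = "ATGCCTAAG"                    # hypothetical coding strand, 5'->3'
template = reverse_complement(coding)   # CTTAGGCAT, 5'->3'
rna = transcribe_template(template)     # AUGCCUAAG, 5'->3'

# The transcript matches the coding strand, with uracil in place of thymine.
assert rna == coding.replace("T", "U")
```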

Transcription is divided into 5 stages: pre-initiation, initiation, promoter clearance, elongation and termination.
