Transcription (genetics)

Transcription is the process through which a DNA sequence is enzymatically copied by an RNA polymerase to produce a complementary RNA; in other words, it is the transfer of genetic information from DNA into RNA. In the case of protein-encoding DNA, transcription is the beginning of the process that ultimately leads to the translation of the genetic code (via the mRNA intermediate) into a functional peptide or protein. Transcription has some proofreading mechanisms, but they are fewer and less effective than those for DNA replication; therefore, transcription has a lower copying fidelity than DNA replication.

Like DNA replication, transcription proceeds in the 5' → 3' direction (i.e., the template strand is read in the 3' → 5' direction and the new, complementary RNA is generated in the 5' → 3' direction). Transcription is divided into three stages: initiation, elongation and termination.
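The directionality described above can be illustrated with a minimal Python sketch. The function name `transcribe` and the convention of passing the template strand written 5' → 3' are assumptions made for this example; the sketch models only base pairing and ignores promoters, terminators and proofreading.

```python
# Pairing rule from a DNA template base to the RNA base placed opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3: str) -> str:
    """Return the RNA (written 5'->3') complementary to a DNA template strand.

    The template is supplied written 5'->3'. RNA polymerase reads it 3'->5',
    so the string is walked in reverse while each base is paired.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3.upper()))


if __name__ == "__main__":
    print(transcribe("TACGGT"))
```

Reading the template in reverse is what makes the returned string run 5' → 3', matching the direction in which the new RNA is generated.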

Prokaryotic transcription

 * Occurs in the cytoplasm alongside translation; translation begins before transcription has been completed

Initiation
The following steps occur, in order, for transcription initiation:
 * RNA polymerase (RNAP) recognizes and specifically binds to the promoter region on DNA. At this stage, the DNA is double-stranded ("closed"). This RNAP/wound-DNA structure is referred to as the closed complex.
 * The DNA is unwound and becomes single-stranded ("open") in the vicinity of the initiation site (defined as +1). This RNAP/unwound-DNA structure is called the open complex.
 * The RNA polymerase transcribes the DNA, but produces about 10 abortive (short, non-productive) transcripts which are unable to leave the RNA polymerase because the exit channel is blocked by the σ-factor.
 * The σ-factor eventually dissociates from the holoenzyme, and elongation proceeds.

Most transcripts originate using adenosine-5'-triphosphate (ATP) and, to a lesser extent, guanosine-5'-triphosphate (GTP) (purine nucleoside triphosphates) at the +1 site. Uridine-5'-triphosphate (UTP) and cytidine-5'-triphosphate (CTP) (pyrimidine nucleoside triphosphates) are disfavoured at the initiation site.

Elongation
The RNA polymerase runs along the DNA, synthesizing the complementary RNA in the process. In prokaryotes, the nascent mRNA is translated co-transcriptionally by ribosomes.

Some proofreading occurs during this process:
 * Pyrophosphorolytic editing - RNA polymerase immediately removes an incorrectly incorporated nucleotide by reversing the reaction that added it.
 * Hydrolytic editing - RNA polymerase backtracks by one or more bases to remove an incorrect nucleotide; this cleavage is stimulated by Gre factors.

Termination
Two termination mechanisms are well known:
 * Intrinsic termination (also called Rho-independent termination) involves terminator sequences within the RNA that signal the RNA polymerase to stop. The terminator sequence is usually a palindromic sequence that forms a stem-loop hairpin structure that leads to the dissociation of the RNAP from the DNA template.
 * Rho-dependent termination uses a termination factor called ρ factor to stop RNA synthesis at specific sites. This protein binds to the mRNA and moves along it towards the RNAP. When the ρ factor reaches the RNAP, it causes the RNAP to dissociate from the DNA, terminating transcription.

Other descriptions of termination involve RNAP encountering a region of repeated thymidine residues in the DNA template, or a GC-rich inverted repeat followed by four A residues. The inverted repeat forms a stable stem-loop structure in the RNA, which causes the RNA to dissociate from the DNA template.
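The stem-loop signal described above can be searched for in an RNA sequence with a short Python sketch. The function names, the fixed stem length, the loop-size range and the requirement of four U residues after the stem (the RNA counterpart of the A residues in the template) are all illustrative assumptions. The sketch demands perfect base pairing and does not check GC content, so it is far cruder than real terminator prediction.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_terminator_hairpins(rna, stem_len=5, min_loop=3, max_loop=8, u_run=4):
    """Return (start, loop_length) for each perfect stem-loop followed by a U run.

    A hit requires: a left stem arm of `stem_len` bases, a loop of
    `min_loop`..`max_loop` bases, a right arm that is the exact reverse
    complement of the left arm, and `u_run` consecutive U residues after it.
    """
    hits = []
    last_start = len(rna) - 2 * stem_len - min_loop - u_run
    for start in range(last_start + 1):
        left_arm = rna[start:start + stem_len]
        for loop_len in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop_len
            right_end = right_start + stem_len
            right_arm = rna[right_start:right_end]
            tail = rna[right_end:right_end + u_run]
            if right_arm == reverse_complement(left_arm) and tail == "U" * u_run:
                hits.append((start, loop_len))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: stem GCCGC, loop UUCG, stem GCGGC, then UUUU.
    print(find_terminator_hairpins("AAGCCGCUUCGGCGGCUUUUAA"))
```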

In schematic diagrams of a prokaryotic gene, the -35 region and the -10 ("Pribnow box") region comprise the basic prokaryotic promoter, and |T| stands for the terminator. The DNA on the template strand between the +1 site and the terminator is transcribed into RNA, which is then translated into protein.

Promoters can differ in "strength"; that is, how actively they promote transcription of their adjacent DNA sequence. Promoter strength is, in many (but not all) cases, a matter of how tightly RNA polymerase and its associated accessory proteins bind to their respective DNA sequences. The more similar the sequences are to a consensus sequence, the stronger the binding is.
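The idea of similarity to a consensus can be made concrete with a rough Python sketch that counts matching positions. The hexamers TTGACA (-35) and TATAAT (-10) are the commonly cited E. coli consensus sequences and are not given in this article; they, the function names and the simple match count are assumptions for illustration. As noted above, sequence similarity is only a proxy for binding and does not determine promoter strength in every case.

```python
# Commonly cited E. coli consensus hexamers (an assumption of this sketch).
CONSENSUS_MINUS_35 = "TTGACA"
CONSENSUS_MINUS_10 = "TATAAT"


def count_matches(sequence: str, consensus: str) -> int:
    """Count positions at which `sequence` agrees with `consensus`."""
    if len(sequence) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return sum(a == b for a, b in zip(sequence.upper(), consensus))


def consensus_score(minus_35: str, minus_10: str) -> int:
    """Return the number of consensus matches across both hexamers (0 to 12)."""
    return (count_matches(minus_35, CONSENSUS_MINUS_35)
            + count_matches(minus_10, CONSENSUS_MINUS_10))


if __name__ == "__main__":
    print(consensus_score("TTGACA", "TATAAT"))  # perfect consensus promoter
    print(consensus_score("TTTACA", "TATGTT"))  # hypothetical weaker promoter
```

A higher score only suggests tighter binding under this simplified model; spacing between the two elements and accessory proteins are ignored.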

Eukaryotic transcription
Eukaryotes have evolved much more complex transcriptional regulatory mechanisms than prokaryotes. For instance, in eukaryotes the genetic material (DNA), and therefore transcription, is localized to the nucleus, where it is separated from the cytoplasm (where translation occurs) by the nuclear membrane. This allows for the temporal regulation of gene expression through the sequestration of the RNA in the nucleus, and allows for selective transport of RNAs to the cytoplasm, where the ribosomes reside.

Adding to this complexity, eukaryotes have three RNA polymerases, each with distinct roles and properties:


 * RNA Polymerase I is located in the nucleolus and transcribes ribosomal RNA (rRNA).
 * RNA Polymerase II is localized to the nucleus, and transcribes messenger RNA (mRNA) and most small nuclear RNAs (snRNAs).
 * RNA Polymerase III transcribes transfer RNA (tRNA) and other small RNAs.

Further complexity is added by the multitude of transcription factors and signaling pathways that may interact in combination to mediate cell-type and developmental transcriptional regulation.

The basal eukaryotic transcription complex includes the RNA polymerase and additional proteins that are necessary for correct initiation and elongation.

Primary (initial) mRNA transcripts in eukaryotic cells are synthesized as larger precursor RNAs that are processed by splicing out introns (non-coding sequences) and ligating exons (non-contiguous coding sequences) into the mature mRNA. Primary transcripts for some genes can be large. The primary transcripts of the neurexin genes, for instance, are as large as 1.7 megabases (1,700,000 bases), while the mature (processed) neurexin mRNAs are under 10 kilobases (10,000 bases), with as many as 24 exons and thousands of possible alternative splice variants that produce proteins with different activities.
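Splicing as described above, and the way alternative splice variants arise from one primary transcript, can be sketched in a few lines of Python. The function name `splice`, the coordinate convention and the toy sequence are all hypothetical; real splicing depends on splice-site recognition by cellular machinery, which is not modelled here.

```python
def splice(primary_transcript: str, exons) -> str:
    """Join exon segments of a primary transcript into a mature mRNA.

    `exons` is a list of (start, end) pairs, 0-based with `end` exclusive,
    in 5'->3' order. Anything outside these intervals is treated as intron
    and discarded.
    """
    return "".join(primary_transcript[start:end] for start, end in exons)


# Toy primary transcript: exon, intron, exon, intron, exon (made-up sequence).
PRE_MRNA = "AUGGCC" + "GUAAGUUUUCAG" + "GAUUAC" + "GUAUGUCCCAG" + "UGGUAA"
ALL_EXONS = [(0, 6), (18, 24), (35, 41)]

if __name__ == "__main__":
    print(splice(PRE_MRNA, ALL_EXONS))                        # all three exons
    print(splice(PRE_MRNA, [ALL_EXONS[0], ALL_EXONS[2]]))     # exon 2 skipped
```

Choosing different subsets of exons from the same primary transcript is what produces alternative splice variants in this simplified picture.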

Gene expression in eukaryotes is also controlled by complex interactions between cis-acting elements within the regulatory regions of the DNA, and trans-acting factors that include transcription factors and the basal transcription complex.

Initiation
The core promoter of protein-encoding genes contains binding sites for the basal transcription complex and RNA polymerase II, and is normally within about 50 bases upstream of the transcription initiation site. Further transcriptional regulation is provided by upstream control elements (UCEs), usually present within about 200 bases upstream of the initiation site. The core promoter for RNAP II normally (though not always) contains a TATA box, a highly conserved DNA sequence.

Some genes also have enhancer elements that can be thousands of bases upstream or downstream of the transcription initiation site. Combinations of these upstream control elements and enhancers regulate and amplify the formation of the basal transcription complex.
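As a rough illustration of locating a core promoter element, the following Python sketch searches the region upstream of the initiation site for a TATA box. The motif TATAAA is the commonly cited TATA box consensus and is an assumption here, as are the function name, the exact-match search and the default window of 50 bases taken from the text above. Real promoters tolerate variation in the motif, which this sketch does not.

```python
def find_tata_box(sequence: str, tss_index: int, window: int = 50,
                  motif: str = "TATAAA"):
    """Return the position of the TATA box relative to the initiation site.

    `tss_index` is the 0-based index of the +1 base. The search covers the
    `window` bases immediately upstream of it. The result is a negative
    offset (for example -30), or None if the motif is absent. If the motif
    occurs more than once, the occurrence closest to the initiation site
    is reported.
    """
    upstream_start = max(0, tss_index - window)
    upstream = sequence[upstream_start:tss_index].upper()
    offset = upstream.rfind(motif)
    if offset == -1:
        return None
    return upstream_start + offset - tss_index


if __name__ == "__main__":
    # Hypothetical sequence with the motif starting 30 bases upstream of +1.
    demo = "G" * 20 + "TATAAA" + "C" * 24 + "ATG"
    print(find_tata_box(demo, tss_index=50))
```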

Transcription Process
For the pathway and process of construction of the transcription complex, please see the individual polymerases: RNA polymerase I, RNA polymerase II and RNA polymerase III.

Measuring and detecting transcription
Transcription can be measured and detected in a variety of ways:
 * Northern blot
 * RNase protection assay
 * RT-PCR
 * In vitro transcription
 * In situ hybridization

History
RNA synthesis by RNA polymerase had been established in vitro by several laboratories by 1965; however, the RNA synthesized by these enzymes had properties that suggested the existence of an additional factor needed to terminate transcription correctly.

By the late 1960s several papers that came out of the Harvard University Biological Laboratories established the basic mechanics of gene expression in bacteria.

Terminology

 * Activator: a DNA-binding protein that regulates one or more genes by increasing the rate of transcription
 * Repressor: a DNA-binding protein that regulates one or more genes by decreasing the rate of transcription

Reverse transcription
Some viruses have the ability to transcribe RNA into DNA, which allows their genetic material to be inserted into the host cell's genome. The main enzyme responsible for this type of transcription is reverse transcriptase (also called revertase). Because of the activity of this enzyme inside the cell, the viral genome is often replicated along with the cell's genome.
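At the level of base pairing, reverse transcription is the mirror image of transcription, which a minimal Python sketch can show. The function name `reverse_transcribe` and the convention of writing both strands 5' → 3' are assumptions for this example; the sketch produces only the first complementary DNA strand and ignores priming and integration into the host genome.

```python
# Pairing rule from an RNA template base to the DNA base placed opposite it.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_5to3: str) -> str:
    """Return the complementary DNA (written 5'->3') for an RNA template.

    The RNA is supplied written 5'->3' and is read 3'->5', so the string is
    walked in reverse while each base is paired with its DNA complement.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5to3.upper()))


if __name__ == "__main__":
    print(reverse_transcribe("ACCGUA"))
```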