RNA polymerase

Overview
RNA polymerase (RNAP or RNApol) is an enzyme that makes an RNA copy of a DNA or RNA template. In cells, RNAP is needed for constructing RNA chains from DNA genes, a process called transcription. RNA polymerase enzymes are essential to life and are found in all organisms and many viruses. In chemical terms, RNAP is a nucleotidyl transferase that polymerizes ribonucleotides at the 3' end of an RNA transcript.

History
RNAP was discovered independently by Sam Weiss and Jerard Hurwitz in 1960. By this time, one half of the 1959 Nobel Prize in Medicine had been awarded to Severo Ochoa for the discovery of what was believed to be RNAP, but instead turned out to be a ribonuclease (polynucleotide phosphorylase); Arthur Kornberg received the other half for work on DNA synthesis.

The 2006 Nobel Prize in Chemistry was awarded to Roger Kornberg for creating detailed molecular images of RNA polymerase during various stages of the transcription process.

Control of transcription


Control of the process of gene transcription affects patterns of gene expression and thereby allows a cell to adapt to a changing environment, perform specialized roles within an organism, and maintain basic metabolic processes necessary for survival. Therefore, it is hardly surprising that the activity of RNAP is both complex and highly regulated. In Escherichia coli bacteria, more than 100 factors have been identified which modify the activity of RNAP.

RNAP initiates transcription at specific DNA sequences known as promoters. It then produces an RNA chain that is complementary to the template DNA strand. The process of adding nucleotides to the RNA strand is known as elongation; in eukaryotes, RNAP can build chains as long as 2.4 million nucleotides (the full length of the dystrophin gene). RNAP preferentially releases its RNA transcript at specific DNA sequences encoded at the end of genes, known as terminators.
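
The complementarity rule described above can be illustrated with a minimal sketch. The function name `transcribe` and the example sequence are hypothetical, and the template is assumed to be written in the 3'-to-5' direction so that the returned RNA reads 5' to 3'.

```python
# Watson-Crick pairing from a DNA template base to the RNA base added opposite it.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') complementary to a template strand given 3'->5'."""
    return "".join(PAIRING[base] for base in template_3to5.upper())

# Hypothetical 9-nucleotide template, for illustration only.
print(transcribe("TACGGATTC"))  # AUGCCUAAG
```

Each template base selects exactly one ribonucleotide, and uracil takes the place of thymine opposite adenine.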

Products of RNAP include:
 * Messenger RNA (mRNA): template for the synthesis of proteins by ribosomes.
 * Non-coding RNA or "RNA genes": a broad class of genes that encode RNA that is not translated into protein. The most prominent examples of RNA genes are transfer RNA (tRNA) and ribosomal RNA (rRNA), both of which are involved in the process of translation. However, since the late 1990s, many new RNA genes have been found, and thus RNA genes may play a much more significant role than previously thought.
 * Transfer RNA (tRNA): transfers specific amino acids to growing polypeptide chains at the ribosomal site of protein synthesis during translation.
 * Ribosomal RNA (rRNA): a component of ribosomes.
 * Micro RNA: regulates gene activity.
 * Catalytic RNA (ribozyme): enzymatically active RNA molecules.

RNAP accomplishes de novo synthesis, that is, it can start a chain without a primer. It is able to do this because specific interactions with the initiating nucleotide hold RNAP rigidly in place, facilitating chemical attack on the incoming nucleotide. Such specific interactions explain why RNAP prefers to start transcripts with ATP (followed by GTP, UTP, and then CTP). In contrast to DNA polymerase, RNAP includes helicase activity, so no separate enzyme is needed to unwind DNA.

Binding and initiation
RNA polymerase binding in bacteria involves the α subunit recognizing the upstream element (-40 to -70 base pairs) in DNA, as well as the σ factor recognizing the -10 to -35 region. There are numerous σ factors that regulate gene expression. For example, σ70 is expressed under normal conditions and allows RNAP binding to house-keeping genes, while σ32 elicits RNAP binding to heat-shock genes.
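
As a rough illustration of promoter recognition by a σ factor, the sketch below scans a sequence for a -35 hexamer and a -10 hexamer separated by a spacer. The consensus hexamers TTGACA and TATAAT, the 16 to 18 bp spacer range, and the function name `find_promoters` are assumptions that are not given in the text above. Real promoters usually deviate from the consensus, so exact matching is a deliberate simplification.

```python
# Assumed E. coli sigma-70 consensus hexamers (not stated in the article text).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"

def find_promoters(dna, gaps=(16, 17, 18)):
    """Return (start of -35 box, start of -10 box) for exact consensus matches."""
    dna = dna.upper()
    hits = []
    start = dna.find(MINUS_35)
    while start != -1:
        for gap in gaps:
            pos10 = start + len(MINUS_35) + gap  # where the -10 box would begin
            if dna[pos10:pos10 + len(MINUS_10)] == MINUS_10:
                hits.append((start, pos10))
        start = dna.find(MINUS_35, start + 1)  # look for the next -35 box
    return hits

# Hypothetical sequence with a 17 bp spacer between the two boxes.
seq = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "CC"
print(find_promoters(seq))  # [(2, 25)]
```

A more realistic scanner would score partial matches, for example with a position weight matrix, rather than requiring identity to the consensus.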

After binding to the DNA, the RNA polymerase switches from a closed complex to an open complex. This change involves the separation of the DNA strands to form an unwound section of DNA of approximately 13 bp. Ribonucleotides are base-paired to the template DNA strand, according to Watson-Crick base-pairing interactions. Supercoiling plays an important part in polymerase activity because of the unwinding and rewinding of DNA. Because regions of DNA in front of RNAP are unwound, there are compensatory positive supercoils. Regions behind RNAP are rewound, and negative supercoils are present.

Elongation
Transcription elongation involves the further addition of ribonucleotides and the change of the open complex to the transcriptional complex. RNAP cannot immediately start forming full-length transcripts because of its strong binding to the promoter. Transcription at this stage primarily results in short RNA fragments of around 9 nucleotides, in a process known as abortive transcription. Once the RNAP starts forming longer transcripts, it clears the promoter. At this point, the -10 to -35 promoter region is disrupted, and the σ factor falls off RNAP. This allows the rest of the RNAP complex to move forward, as the σ factor held the RNAP complex in place.

The 17 bp transcriptional complex has an 8 bp DNA-RNA hybrid; that is, 8 base pairs involve the RNA transcript bound to the DNA template strand. As transcription progresses, ribonucleotides are added to the 3' end of the RNA transcript and the RNAP complex moves along the DNA. Although RNAP does not seem to have the 3' exonuclease activity that characterizes the proofreading found in DNA polymerase, there is evidence that RNAP will halt at a mismatched base pair and correct it.

The addition of ribonucleotides to the RNA transcript has a very similar mechanism to DNA polymerization, and it is believed that these polymerases are evolutionarily related. Aspartyl (asp) residues in the RNAP hold onto Mg2+ ions, which in turn coordinate the phosphates of the ribonucleotides. The first Mg2+ holds onto the α-phosphate of the NTP to be added. This allows the nucleophilic attack of the 3' OH from the RNA transcript, adding an additional NMP to the chain. The second Mg2+ holds onto the pyrophosphate of the NTP. The overall reaction equation is:

(NMP)_n + NTP → (NMP)_{n+1} + PP_i
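
The stoichiometry of this reaction can be tallied with a short sketch. It assumes that the first nucleotide of a de novo transcript keeps its 5' triphosphate, so that one pyrophosphate (PPi) is released per phosphodiester bond formed; the function name `elongation_ledger` and the example transcript are hypothetical.

```python
from collections import Counter

def elongation_ledger(rna: str):
    """Count NTPs consumed and PPi released to build `rna` de novo."""
    # One NTP is consumed for every residue in the finished chain.
    ntps_consumed = Counter(base + "TP" for base in rna.upper())
    # Assumption: the initiating NTP keeps its triphosphate, so PPi is
    # released once per bond, i.e. (length - 1) times.
    ppi_released = max(len(rna) - 1, 0)
    return ntps_consumed, ppi_released

ntps, ppi = elongation_ledger("AUGCCUAAG")
print(dict(ntps))  # {'ATP': 3, 'UTP': 2, 'GTP': 2, 'CTP': 2}
print(ppi)         # 8
```

Applying the reaction equation repeatedly, a chain of n nucleotides requires n NTPs and releases n - 1 molecules of PPi under this assumption.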

Termination
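Termination of RNA transcription can be rho-independent or rho-dependent. In rho-dependent termination, the rho protein destabilizes the interaction between the DNA template and the RNA, releasing the newly synthesized transcript from the elongation complex.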

Rho-independent transcription termination is the termination of transcription without the aid of the rho protein. Transcription of a palindromic region of DNA causes the formation of a hairpin structure, as the RNA transcript loops and binds upon itself. This hairpin structure is often rich in G-C base pairs, making it more stable than the DNA-RNA hybrid itself. As a result, the 8 bp DNA-RNA hybrid in the transcription complex shifts to a 4 bp hybrid. These last 4 base pairs are weak A-U base pairs, so the entire RNA transcript falls off.
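
The hairpin-plus-uridine signature of a rho-independent terminator can be searched for with a simple scan. This is a minimal sketch under stated assumptions: it requires a perfectly paired stem, and the stem length, loop range, U-run length, and the function name `find_terminators` are illustrative choices rather than values from the text.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna: str) -> str:
    """Return the sequence that would pair with `rna` in an antiparallel stem."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def find_terminators(rna, stem=6, min_loop=3, max_loop=8, u_run=4):
    """Return (stem start, loop length) for perfect hairpins followed by a U run."""
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - 2 * stem - min_loop - u_run + 1):
        left = rna[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop                       # start of the right arm
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + u_run]     # bases just after the stem
            if right == reverse_complement(left) and tail == "U" * u_run:
                hits.append((i, loop))
    return hits

# Hypothetical transcript: GC-rich stem, 4-nt loop, then four uridines.
rna = "AA" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUU" + "AC"
print(find_terminators(rna))  # [(2, 4)]
```

Real terminators tolerate mismatches and G-U pairs in the stem, so a practical search would score stem stability instead of demanding exact complementarity.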

RNA polymerase in bacteria
In bacteria, the same enzyme catalyzes the synthesis of mRNA and ncRNA.

RNAP is a relatively large molecule. The core enzyme has 5 subunits (~400 kDa):
 * α2: the two α subunits assemble the enzyme and recognize regulatory factors. Each subunit has two domains: αCTD (C-Terminal domain) binds the UP element of the extended promoter, and αNTD (N-terminal domain) binds the rest of the polymerase.
 * β: has the polymerase activity (catalyzes the synthesis of RNA), which includes chain initiation and elongation.
 * β': binds to DNA (nonspecifically).
 * ω: restores denatured RNA polymerase to its functional form in vitro and is now known to promote assembly. It has been observed to offer a protective/chaperone function to the β' subunit in Mycobacterium smegmatis.

To bind promoter regions specifically, the core enzyme requires another subunit, sigma (σ). The sigma factor greatly reduces the affinity of RNAP for nonspecific DNA while increasing specificity for certain promoter regions, depending on the sigma factor. That way, transcription is initiated at the right region. The complete holoenzyme therefore has 6 subunits: α2ββ'σω (~480 kDa). The structure of RNAP exhibits a groove with a length of 55 Å (5.5 nm) and a diameter of 25 Å (2.5 nm). This groove comfortably fits the 20 Å (2 nm) DNA double helix, and its 55 Å (5.5 nm) length can accommodate 16 nucleotides.
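
The figure of 16 nucleotides can be checked with a short calculation. The sketch assumes an axial rise of about 3.4 Å per base pair, the usual value for B-form DNA, which is not stated in the text; the function name `base_pairs_spanned` is hypothetical.

```python
RISE_PER_BP_ANGSTROM = 3.4  # assumed axial rise per base pair of B-form DNA

def base_pairs_spanned(length_angstrom, rise=RISE_PER_BP_ANGSTROM):
    """Return how many whole base pairs fit along a groove of the given length."""
    return int(length_angstrom // rise)

print(base_pairs_spanned(55))  # 16
print(25 - 20)                 # 5 (Å of clearance between groove and helix diameters)
```

Under that assumption, 55 Å divided by 3.4 Å per base pair gives roughly 16.2, which is consistent with the quoted capacity of 16 nucleotides.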

When not in use, RNA polymerase holoenzyme binds to low-affinity sites, which allows rapid exchange for an active promoter site when one opens. The holoenzyme therefore does not float freely in the cell when not in use.

Transcriptional cofactors
A number of proteins can bind to RNAP and modify its behavior. For instance, GreA and GreB from E. coli can enhance the ability of RNAP to cleave the RNA transcript near the growing end of the chain. This cleavage can rescue a stalled polymerase molecule, and is likely involved in proofreading the occasional mistakes made by RNAP. A separate cofactor, Mfd, is involved in transcription-coupled repair, the process in which RNAP recognizes damaged bases in the DNA template and recruits enzymes to restore the DNA. Other cofactors are known to play regulatory roles, that is, they help RNAP choose whether or not to express certain genes.

RNA polymerase in eukaryotes
Eukaryotes have several types of RNAP, characterized by the type of RNA they synthesize:
 * RNA polymerase I synthesizes a 45S pre-rRNA, which matures into the 28S, 18S and 5.8S rRNAs that form the major RNA sections of the ribosome.
 * RNA polymerase II synthesizes precursors of mRNAs and most snRNAs and microRNAs. This is the most studied type, and because transcription must be tightly controlled, a range of transcription factors is required for its binding to promoters.
 * RNA polymerase III synthesizes tRNAs, 5S rRNA and other small RNAs found in the nucleus and cytosol.

There are other RNA polymerase types in mitochondria and chloroplasts.

RNA polymerase in archaea
Archaea have a single RNAP that is closely related to the three main eukaryotic polymerases. Thus, it has been speculated that the archaeal polymerase resembles the ancestor of the specialized eukaryotic polymerases.

RNA polymerase in viruses
Many viruses also encode an RNAP. Perhaps the most widely studied viral RNAP is found in bacteriophage T7. This single-subunit RNAP is related to that found in mitochondria and chloroplasts, and shares considerable homology with DNA polymerase. It is therefore believed that most viral polymerases evolved from DNA polymerase and are not directly related to the multi-subunit polymerases described above.

The viral polymerases are diverse, and include some forms that can use RNA as a template instead of DNA. This occurs in negative-strand RNA viruses and dsRNA viruses, both of which exist for a portion of their life cycle as double-stranded RNA. However, some positive-strand RNA viruses, such as poliovirus, also contain these RNA-dependent RNA polymerases.

Transcription initiation
The carboxy-terminal domain (CTD) of RNA polymerase II is the portion of the polymerase involved in the initiation of DNA transcription. The CTD typically consists of up to 52 repeats of the sequence Tyr-Ser-Pro-Thr-Ser-Pro-Ser. The transcription factor TFIIH is a kinase that hyperphosphorylates the CTD of RNAP and, in doing so, causes the RNAP complex to move away from the initiation site.
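
The repeat structure of the CTD can be made concrete with a small sketch that builds an idealized domain and lists its serines, the residue type that appears three times in each heptad. It assumes that all 52 repeats match the consensus exactly, which real CTDs do not; the names `build_ctd` and `serine_positions` are hypothetical.

```python
HEPTAD = ("Tyr", "Ser", "Pro", "Thr", "Ser", "Pro", "Ser")

def build_ctd(repeats=52):
    """Return an idealized CTD as a list of residues, all repeats consensus."""
    return list(HEPTAD) * repeats

def serine_positions(ctd):
    """Return 1-based (repeat number, position in heptad) for every serine."""
    return [(i // 7 + 1, i % 7 + 1) for i, res in enumerate(ctd) if res == "Ser"]

ctd = build_ctd()
sites = serine_positions(ctd)
print(len(ctd))    # 364
print(len(sites))  # 156
print(sites[:3])   # [(1, 2), (1, 5), (1, 7)]
```

With 52 consensus repeats, the domain spans 364 residues and offers 156 serine hydroxyl groups as candidate phosphorylation sites, which gives a sense of how heavily the CTD can be modified.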

5' capping
The carboxy-terminal domain is also the binding site of the cap-synthesizing and cap-binding complexes. In eukaryotes, after transcription of the 5' end of an RNA transcript, the cap-synthesizing complex on the CTD removes the gamma-phosphate from the 5' triphosphate and attaches a GMP, forming a 5',5'-triphosphate linkage. The synthesizing complex falls off and the cap then binds to the cap-binding complex (CBC), which is bound to the CTD.

The 5' cap of eukaryotic RNA transcripts is important for binding of the RNA transcript to the ribosome during translation and to the CTD of RNAP, and it prevents RNA degradation.

Spliceosome
The carboxy-terminal domain is also the binding site for spliceosome factors that take part in RNA splicing. These allow for the splicing and removal of introns (in the form of a lariat structure) during RNA transcription.

Mutation in the CTD
Major studies have examined the effect of knocking out particular amino acids in the CTD. The results indicate that RNA polymerase II CTD truncation mutations affect the ability to induce transcription of a subset of genes in vivo, and the lack of response to induction maps to the upstream activating sequences of these genes.

RNA polymerase purification
RNA polymerase can be isolated in the following ways:
 * By a phosphocellulose column.
 * By glycerol gradient centrifugation.
 * By a DNA column.
 * By an ion-exchange column.

Combinations of the above techniques are also used.