Structural Biochemistry/Nucleic Acid/Transcription

Transcription, also known as RNA synthesis, is the process in which a DNA nucleotide sequence is copied into RNA. In this process, genetic information is copied from one molecule to another. In prokaryotes, the mRNA is made and then translated to make proteins; transcription and translation can occur simultaneously in the cytoplasm. In eukaryotes, the genetic material is transcribed in the nucleus. Transcription in eukaryotes is much more complex than in prokaryotes. One reason is the presence of histones on eukaryotic DNA, which tend to hinder the access of polymerases to the promoter.

The process of transcription can be thought of as four sequential steps. The first is initiation, during which RNA polymerase II (RNAPII) binds to the DNA to form a preinitiation complex with other transcription factors. The site on the DNA where this occurs is called the "promoter." The second step involves helicase activity that unwinds the DNA double helix. Once the DNA is unwound, synthesis of RNA can begin, using the DNA template strand as a guide. Note that uracil in RNA pairs with adenine in DNA. The third step is elongation, during which the polymerase leaves the promoter behind through a process called promoter clearance and transcribes the rest of the gene. The final step is termination: different signals can bring synthesis to an end, at which point the RNA polymerase releases the DNA.
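
The base-pairing rule used during RNA synthesis can be illustrated with a short Python sketch. It is only a toy model: the function name `transcribe` and the example template are made up for illustration, and the template strand is assumed to be written 3' to 5' so that the RNA comes out 5' to 3'.

```python
# Pairing rules for reading a DNA template strand into RNA.
# Adenine in the DNA template pairs with uracil in the RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

# Hypothetical template strand, written 3'->5'.
print(transcribe("TACGGCATT"))  # AUGCCGUAA
```

The sketch ignores everything about the enzyme itself (promoter recognition, unwinding, termination) and only shows how the template determines the RNA sequence.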

Promoter Sites

Transcription of RNA from DNA begins with the recognition of promoter sites on the DNA strand by RNA polymerase. These promoter sites are specific base sequences that mark where transcription begins on the long DNA strand. Transcription does not simply begin anywhere on the DNA. The probability that transcription would begin at a desired location just by chance is very slim, so RNA polymerase requires sequences that it can recognize in order to initiate transcription. Promoter sites provide these starting points for the synthesis of specific RNA sequences from specific genes on the DNA strand. The first nucleotide to be transcribed is numbered +1. The nucleotide immediately upstream of +1 (adjacent to +1 on the 5' side) is numbered -1; there is no position 0.
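
The numbering convention can be sketched in Python. The helper `position_label` and the idea of storing the sequence as a zero-based string are assumptions made for this example only.

```python
def position_label(index: int, tss_index: int) -> str:
    """Label a zero-based sequence index relative to the transcription start.

    The first transcribed nucleotide is +1 and its upstream neighbor is -1,
    so the label 0 is never produced.
    """
    offset = index - tss_index
    return f"+{offset + 1}" if offset >= 0 else str(offset)

# Hypothetical sequence in which index 40 is the first transcribed nucleotide.
tss = 40
print(position_label(40, tss))  # +1
print(position_label(39, tss))  # -1
print(position_label(30, tss))  # -10
```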

Since RNA polymerase, the enzyme that synthesizes RNA from DNA, polymerizes RNA in the 5' to 3' direction, the promoter site where it attaches is always upstream of the gene of interest, meaning closer to the 5' end of the DNA. Often, other molecules attach to the promoter site first and then recruit RNA polymerase to attach there and begin transcription; these molecules are called transcription factors.

In bacteria, there are two distinct sequences upstream (5') of the first nucleotide to be transcribed that function as promoter sites and determine where transcription will begin. One of them is located about 10 nucleotides to the 5' side of the first transcribed nucleotide (the -10 region) and is called the Pribnow box, with the consensus sequence "TATAAT". The other, located further upstream at the -35 region, has the consensus sequence "TTGACA". Note that most often, the first nucleotide to be transcribed is a purine.

The proteins that guide RNA polymerase to genes are the sigma factors. A sigma factor binds the core RNA polymerase to form the holoenzyme and then helps the core enzyme recognize a specific DNA sequence, called a promoter, by locating the consensus promoter sequences near the beginning of a gene. A single bacterial species can make several different sigma factors.

Bacterial DNA template----------------TTGACA(-35)-----------TATAAT/Pribnow(-10)------------Start of RNA (+1)

In eukaryotes, the promoter site lies at the -25 region and has the consensus sequence "TATAAA". This sequence is called the "TATA box" or Hogness box. When the cell needs to transcribe the gene, the "TATA binding protein" (a transcription factor) attaches to the TATA box and then helps RNA polymerase attach there and begin synthesizing the RNA. In addition to the TATA box, many eukaryotic genes also have a second promoter site at the -75 region, called the CAAT box, with the consensus sequence GGNCAATCT. Finally, RNA transcription in eukaryotes is also stimulated by enhancer sequences, which can be located far from the +1 region on either the 5' or 3' side.

Eukaryotic DNA template------------CAAT box(-75)/optional----------TATA box(-25)------------Start of RNA(+1)
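
The "N" in the CAAT box consensus GGNCAATCT stands for any base. As a rough illustration, a consensus with N can be turned into a regular expression and searched for in a sequence. The helper names and the example sequence below are hypothetical, and only the N code is handled, not the full set of IUPAC ambiguity codes.

```python
import re

def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile a consensus string, treating N as 'any of A, C, G, T'."""
    return re.compile(consensus.upper().replace("N", "[ACGT]"))

def find_sites(pattern: re.Pattern, sequence: str) -> list:
    """Return zero-based start indices of non-overlapping exact matches."""
    return [match.start() for match in pattern.finditer(sequence.upper())]

CAAT_BOX = consensus_to_regex("GGNCAATCT")

# Made-up sequence containing GGTCAATCT starting at index 2.
print(find_sites(CAAT_BOX, "AAGGTCAATCTCC"))  # [2]
```

Real promoter elements rarely match the consensus exactly, so an exact search like this one will miss many genuine sites.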

Note: Not all base sequences of promoter sites are identical. They are called consensus sequences because they share common features; however, almost all promoter sequences differ from the idealized consensus sequence by one or two bases.
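
Because real promoters differ from the consensus by a base or two, a search that tolerates mismatches is more realistic than an exact match. The following Python sketch counts mismatches in a sliding window; the function names, the mismatch threshold, and the example sequence are assumptions chosen for illustration.

```python
def mismatches(site: str, consensus: str) -> int:
    """Count positions where two equal-length sequences differ."""
    return sum(a != b for a, b in zip(site, consensus))

def find_near_consensus(sequence: str, consensus: str, max_mismatches: int = 2):
    """Return (index, site, mismatch_count) for windows close to the consensus."""
    sequence = sequence.upper()
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        distance = mismatches(window, consensus)
        if distance <= max_mismatches:
            hits.append((start, window, distance))
    return hits

# Made-up sequence with a Pribnow-like site that differs from TATAAT by one base.
print(find_near_consensus("GGCTATGATGG", "TATAAT", max_mismatches=1))
# [(3, 'TATGAT', 1)]
```

A short consensus with a loose threshold will match by chance in a long sequence, which is one reason promoter prediction tools also consider position and spacing.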

Enzymes that replicate DNA do not rely solely on the sequence of bases when determining binding specificity. The three-dimensional structure of the DNA is also important in determining where these proteins will bind. For most DNA-binding proteins, the readout of base pairs through hydrogen bonds or hydrophobic contacts is not sufficient to explain specificity. The shape of the minor groove within a binding site can be “read” by a complementary set of basic side chains on the DNA-binding protein, most typically arginines but also lysines, when they are presented in the correct conformation.

Kinks can contribute to binding specificity by creating conformations that enhance protein-DNA and protein-protein contacts. The DNA-binding site of the catabolite activator protein (CAP) shows large kinks at two base-pair steps, which cause an overall bending of the DNA of about 90° around the protein. The kink at each step creates a space for an arginine residue to engage in partial stacking interactions with a thymine at that site.

Challenges Associated with the Elongation Step

As RNAPII transcribes along the gene (or chromatin) during transcript elongation, it has to find a way to deal with nucleosomes. One way of coping with a nucleosome is to disassemble it into separate histones and uncoiled DNA before transcribing (as shown in the figure). Then, once RNAPII has transcribed along the uncoiled DNA, the separated histones may coil the DNA again to re-form the nucleosome. The histones can disassemble into further subunits, which include the H2A/H2B dimer (depicted in red) and the H3/H4 dimer (depicted in yellow). This disassembly of nucleosomes is usually assisted by ATP-dependent chromatin remodelers and histone chaperones. The identified ATP-dependent chromatin remodelers include SWI-SNF, ISWI, CHD, and INO80/SWR. FACT (Facilitates Chromatin Transcription) is an example of a histone chaperone; it plays a significant role in destabilizing the nucleosomes on a gene in order to facilitate transcript elongation. It functions mainly by removing the H2A/H2B dimer from the nucleosome.

RNAPII Transcription

RNAPII also has to ensure that it inserts the correct nucleotides. This is achieved by a specific structure called the trigger loop, located under the active site of RNAPII, where the nucleotides bind. The function of the trigger loop is to align the incoming nucleotide in the correct orientation for forming a phosphodiester bond with the growing transcript. Only the correct nucleotide can align in the proper orientation with the trigger loop, which is how RNAPII selects the right nucleotide to insert.

The mechanism of elongation also plays a significant role in RNAPII fidelity. Transcript elongation proceeds by a Brownian ratchet mechanism, which allows RNAPII to move back and forth along the gene. By removing a misincorporated nucleotide and inserting the correct one, RNAPII can not only increase fidelity but also enhance the rate of insertion of further nucleotides. This removal of a misincorporated nucleotide usually requires general factors that encourage transcript cleavage, such as TFIIS.

In the elongation of RNA transcripts, the sigma factor remains associated with the transcribing complex until about nine bases have been joined [Microbiology]. The core RNA polymerase then continues to move along the template, synthesizing RNA at about 45 bases per second.
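
As a back-of-the-envelope illustration, the rate quoted above (about 45 per second) gives a rough transcription time for a gene of a given length. The sketch below assumes a constant rate with no pausing or backtracking, and the 1350-base transcript length is a made-up example.

```python
ELONGATION_RATE_PER_SECOND = 45  # approximate rate quoted in the text

def elongation_time_seconds(transcript_length: int,
                            rate: float = ELONGATION_RATE_PER_SECOND) -> float:
    """Estimate elongation time, assuming a constant rate and no pausing."""
    return transcript_length / rate

# Hypothetical 1350-base transcript.
print(elongation_time_seconds(1350))  # 30.0
```

Real elongation rates vary with the organism, the gene, and conditions, so this is only an order-of-magnitude estimate.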

Other Factors Affecting Transcript Elongation

Histone modification is positively correlated with transcript elongation. In other words, faster transcript elongation requires a greater amount of histone modification. One such modification is histone acetylation: acetyl groups are added by histone acetyltransferases (HATs) and removed by histone deacetylases (HDACs). Histone methylation, another type of histone modification, interferes with transcript elongation in order to regulate the rate of histone acetylation. It is believed that histone modification is associated with the disassembly of histones, thereby enhancing transcript elongation.

Transcription Repression

Polycomb group (PcG) proteins are necessary for organisms to develop from cells to tissues. PcG proteins form multi-subunit complexes that function as transcriptional repressors, controlling thousands of genes during cell differentiation and growth in the normal development of the organism. Most multicellular organisms need PcG proteins for growth and development. Homeotic (HOX) genes are correctly expressed during development because of these same proteins, which are also involved in regulating the cell cycle, cancer, X-chromosome inactivation, cell fate, and stem cell pathways and differentiation, among other developmental matters. PcG complexes also have enzymatic activity, which they use at the specific genes they target to downregulate transcription. Their work includes the recruitment of other repressors that act together with them.

PcG proteins are best described as forming two complexes with distinct purposes: PRC1 and PRC2. PRC1 catalyzes the ubiquitylation of histone H2A, which represses gene transcription by making the chromatin more compact and less accessible. PRC2 catalyzes the methylation of histone H3; its components include Suppressor of zeste 12 (SUZ12) and Embryonic ectoderm development (EED). PcG proteins do not bind the same sets of genes in all cells, so the mechanisms and DNA patterns that regulate their binding to promoters must be specific and complex.

As mentioned above, PcG proteins do not work alone: they recruit other factors and proteins to help repress gene transcription. One example is the BCL6 co-repressor, also known as BCOR, which is associated with oculofaciocardiodental (OFCD) syndrome. The complex of PcG and BCOR targets germ cell genes. The best-known case showing how signaling pathways affect the expression of PcG proteins is the conserved hedgehog signaling pathway, which is important during embryonic development. The sonic hedgehog ligand (SHH) is best known for its role in cancer progression and stem cell maturation. More research is yet to be done.


Reference: Lluís Morey and Kristian Helin. Polycomb group protein-mediated repression of transcription. Biotech Research and Innovation Centre (BRIC), University of Copenhagen, and Centre for Epigenetics, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark.

DNA Damage and Mutagenesis

When DNA is damaged by UV light, RNAPII is hindered from transcribing it. The damaged DNA may then be repaired through a mechanism called transcription-coupled nucleotide excision repair (TC-NER). Another response is polyubiquitylation of the stalled RNAPII, which marks it for degradation. Both mechanisms involve RNAPII and help bring about the removal of the damaged part of the DNA.

Observations also show a relationship between transcript elongation and mutagenesis (mutation of the DNA strands). Although very little is known about the specific interactions between the two, observations suggest that an increased rate of transcript elongation results in an increased level of mutation in the DNA. This points to a critical relationship between the level of transcription and the fidelity of DNA replication.

A highly transcribed region spends much of its time single-stranded, and active transcription can interfere with the precision of DNA polymerase as it adds nucleotides opposite the template strand. The single-stranded DNA may not be protected by chromatin proteins and nucleosomes. Even so, there is little evidence that transcription is directly mutagenic to the DNA template strand.

In transcription-associated recombination (TAR), the RNA newly made by RNA polymerase II can pair with the DNA template strand, forming an RNA-DNA hybrid and leaving the other DNA strand displaced; these structures are called R-loops. R-loops lead to genetic instability because the cell has trouble during replication and must activate the S phase checkpoints. Mutants that accumulate R-loops usually do not make it past S phase and are not viable.

RECQL5 Helicase and Genomic Stability

The enzyme RECQL5 helicase may also have a role in maintaining genomic stability. RECQL5 helps prevent the collapse of replication forks, which would lead to DNA damage, and the accumulation of DNA double-strand breaks, which would interfere with future rounds of replication and transcription if the damage were in a coding region. Mutations in proteins similar to RECQL5 have led to an increased rate of cancer.

Gene Traffic

Sometimes more than one RNAPII binds to the same gene for transcript elongation. This creates traffic among the polymerases, which may either decrease the rate of transcript elongation or force the polymerases to move forward at a faster rate. However, very little is currently known about this phenomenon, such as how and why the polymerases respond to the traffic the way they do. Some hypothesize that the main cause is frequent collisions among polymerases that are elongating at different rates on the same strand.

tRNA roles in transcription

tRNA is an RNA molecule and is thus transcribed from DNA; beyond this, it has little to do with transcription. The primary role of tRNA lies in translation, where it interacts with the mature mRNA to deliver the amino acid it carries to the growing polypeptide chain.