Structural Biochemistry/Volume 8

Nucleic acids are long linear polymers, DNA and RNA, that carry genetic information passed from generation to generation. They are composed of three main parts: a pentose sugar, a phosphate group, and a nitrogenous base. The sugars and phosphate groups form the structural backbone, while the bases carry the genetic information and distinguish one nucleotide from another. There are two types of bases, purines and pyrimidines. Whether a nucleic acid is DNA or RNA is determined by its sugar (deoxyribose or ribose) and by whether it uses thymine or uracil.

A conceptualized depiction of multiple nucleotides. Green circles represent the pentose sugars, red circles represent the nucleobases, and the yellow circles represent the phosphate groups. Note that a single nucleotide consists of one sugar, one base, and one phosphate group.

Nucleic acids are composed of smaller subunits called nucleotides. A nucleotide is a nucleoside joined to one or more phosphoryl groups by an ester linkage. The nucleotides of RNA are called adenylate, guanylate, cytidylate, and uridylate. The nucleotides of DNA are called deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate. A nucleoside is just a base attached to a sugar, without the phosphate groups. The nucleosides of RNA are called adenosine, guanosine, cytidine, and uridine. The nucleosides of DNA are called deoxyadenosine, deoxyguanosine, deoxycytidine, and thymidine.
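The naming scheme above can be collected into a small lookup table. This is only an illustrative sketch: the names are the ones listed in the paragraph, while the table layout and the function name `names_for` are assumptions made for the example.

```python
# Nucleoside and nucleotide names as listed in the text.
# The dictionary layout and the helper name are illustrative only.
NAMES = {
    "RNA": {
        "adenine": ("adenosine", "adenylate"),
        "guanine": ("guanosine", "guanylate"),
        "cytosine": ("cytidine", "cytidylate"),
        "uracil": ("uridine", "uridylate"),
    },
    "DNA": {
        "adenine": ("deoxyadenosine", "deoxyadenylate"),
        "guanine": ("deoxyguanosine", "deoxyguanylate"),
        "cytosine": ("deoxycytidine", "deoxycytidylate"),
        "thymine": ("thymidine", "thymidylate"),
    },
}


def names_for(acid, base):
    """Return (nucleoside, nucleotide) names for a base in DNA or RNA."""
    return NAMES[acid][base]


print(names_for("RNA", "uracil"))
print(names_for("DNA", "thymine"))
```

Looking up a base that does not occur in the given nucleic acid (for example uracil in DNA) raises a `KeyError`, which mirrors the fact that the text lists no such name.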

In organic chemistry, a phosphate, or organophosphate, is an ester of phosphoric acid. Organic phosphates are important in biochemistry and biogeochemistry.

General Phosphate structure

The backbone of the DNA strand is made from alternating phosphate and sugar residues. The sugars are joined together by phosphate groups that form phosphodiester bonds between the third and fifth carbon atoms of adjacent sugar rings.

Note that deoxyribose does not contain a hydroxyl group on the 2' carbon. The absence of this hydroxyl group makes DNA more resistant to hydrolysis and therefore more stable than RNA. This is one of the reasons why hereditary material is stored in DNA and not RNA. However, the net negative charge of the phosphate groups must be stabilized by metal ions, such as magnesium or manganese.

Phosphodiester bond


In deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), a phosphodiester bond is a strong covalent linkage between a phosphate group and two five-carbon sugar rings. The phosphate group carries a negative charge and is bonded to the 3' carbon of one ring and the 5' carbon of the next.

Phosphodiester linkage

The phosphodiester bond is formed in a reaction catalyzed by DNA polymerase, in which pyrophosphate (two phosphates) breaks away from the incoming nucleoside triphosphate. During DNA elongation, for example, dATP releases pyrophosphate, and its remaining phosphate forms a phosphodiester bond with the deoxyribose sugar of the nucleotide at the end of the growing strand.

(DNA)n + dATP <--> (DNA)n+1 + PPi
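The bookkeeping of this reaction can be sketched in a few lines of Python: each nucleotide added lengthens the strand by one and releases one pyrophosphate. This is a toy model, not a simulation of the enzyme; the function name `elongate` and the single-letter strand representation are assumptions made for the example.

```python
from typing import Tuple


def elongate(strand: str, incoming: str) -> Tuple[str, int]:
    """Add each incoming nucleotide to the 3' end of `strand`.

    Returns the extended strand and the number of pyrophosphate (PPi)
    molecules released, one per nucleotide added.
    """
    ppi_released = 0
    for base in incoming:
        if base not in "ACGT":
            raise ValueError(f"not a DNA base: {base!r}")
        strand += base      # (DNA)n becomes (DNA)n+1
        ppi_released += 1   # one PPi leaves per dNTP consumed
    return strand, ppi_released


new_strand, ppi = elongate("ACG", "TA")
print(new_strand, ppi)
```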

A phosphodiesterase is an enzyme that hydrolyzes phosphodiester bonds, for example the bond in a cyclic nucleotide phosphate. Phosphodiesterases are of clinical significance, including in the repair of DNA sequences.



Carbohydrates are comprised of monosaccharide units which create sugars ranging from simplest of sugars such as glucose (chemical formula: C6H12O6) to the more complex polysaccharides such as starch. Single nucleotide monomeric units consist of one sugar molecule connected to 1) a heterocyclic nitrogen containing organic base, and 2) a Phosphate group that connects the sugar component of different nucleotides together. The organic base is usually attached to Carbon 1' of the sugar, while the Phosphate group is connected to Carbon 5' of the sugar. When strung together, the phosphate of the neighboring nucleotide attaches to Carbon 3' of the sugar.

Monosaccharides consist of aldehyde or ketone groups with hydroxyl groups as substituents. Sugars that contain an aldehyde group are called aldoses, and the sugars that contain a ketone group are called ketoses.

Sugars that are non-superimposable mirror images of each other are called enantiomers. Sugars that are stereoisomers but not mirror images of each other are called diastereoisomers. Sugars that differ in configuration at a single chiral center are called epimers.

Sugars can exist in open-chain form or ring form. To form a six-membered hemiacetal ring, the carbon of the aldehyde group (C-1) is attacked by the oxygen atom of the C-5 hydroxyl group. The six-membered cyclic hemiacetal is called a pyranose because its structure resembles that of pyran. To form a five-membered ring from a ketohexose such as fructose, the ketone carbon (C-2) is attacked by the oxygen atom of the C-5 hydroxyl group. The five-membered ring is called a furanose because its structure resembles that of furan. When a furanose or pyranose ring is formed, a new stereocenter is created, and this new chiral carbon is called the anomeric carbon. It can have one of two configurations. In a Haworth projection of a D-sugar, the α configuration has the anomeric hydroxyl group pointing down, and the β configuration has it pointing up. For D-glucopyranose, the α anomer is S at C-1 and the β anomer is R. The two forms are diastereomers, not enantiomers, and the α and β forms are called anomers.

A reducing sugar is one that can act as a reducing agent because it has a relatively reactive hemiacetal group at the anomeric carbon. Examples include glucose, fructose, lactose, and maltose. The anomeric carbon in all of these molecules is free to react.

A non-reducing sugar, such as sucrose, does not react in this way. Its anomeric carbons are locked in an acetal, so there is no free aldehyde or ketone group to react. In sucrose, neither monosaccharide of the disaccharide can easily open into an aldehyde or ketone, which makes it non-reducing: the glycosidic bond ties up both anomeric carbons. To determine whether a sugar is reducing, a Fehling's or Tollens' test is performed. In the Fehling's test a brick-red precipitate is the positive result, and in the Tollens' test a silver mirror is the positive result.

In contrast, when a sugar is oxidized, the aldehyde or ketone carbonyl becomes a carboxyl group.

A bond is called an O-glycosidic bond if the anomeric carbon is attached to the oxygen atom of a hydroxyl group. It is called an N-glycosidic bond if the anomeric carbon is attached to the nitrogen atom of an amine group.

Glycosidic bonds are also what form the bridges between monosaccharides. If monosaccharides are joined by O-glycosidic bonds, they are called oligosaccharides.

The presence or absence of an -OH group on carbon 2' of the sugar is the difference between RNA and DNA. In RNA, carbon 2' carries an -OH group, whereas in DNA carbon 2' carries just a hydrogen. The sugar in RNA, or "ribonucleic acid", is "ribose", while the sugar in DNA, or "DEOXYribonucleic acid", is "deoxyribose". DEOXY- indicates the lack of the oxygen from the -OH group on carbon 2' of ribose.

Importance of sugar in glycoproteins

Three-dimensional drawing of a cell membrane, depicting the relationship between sugars and proteins in glycoproteins.

Proteins with attached sugars, called glycoproteins, are another important component of the cell. The sugar components of glycoproteins are oriented toward the watery exterior of the cell. These sugar components serve as identifiers, like cellular address labels. When signaling molecules traveling through bodily fluids encounter particular patterns of sugars, they are either granted access or turned away. The glycoproteins therefore act as regulators or gatekeepers for cells. In addition, they help direct the formation of organs and tissues by bringing the correct cells together. Sugar coatings also help cells move through blood vessels by latching onto cell surface receptors, which provides traction.



Davis, Alison. "The Chemistry of Health." NIGMS, August 2006: 36-42.


Ribose primarily occurs as D-ribose. It is an aldopentose, a monosaccharide containing five carbon atoms that has an aldehyde functional group at one end. Typically, this species exists in the cyclic form. Ribose forms part of the backbone of RNA and is related to deoxyribose, found in DNA, by removal of the hydroxyl group on the 2' carbon.

Ribose makes RNA less resistant to hydrolysis than DNA because of the hydroxyl group on the 2' carbon, which sits close to the negatively charged phosphodiester bridge. The hydroxyl group can attack the phosphodiester bond that links the ribose to the next one, forming a cyclic phosphate. An example of a cyclic nucleotide is cyclic adenosine monophosphate (cAMP).

Roles of D-ribose in the body


Aside from being part of the backbone of RNA (and, as deoxyribose, of DNA), D-ribose is also important in the synthesis of ATP, which all cells require to stay alive. It is used as a supplement intended to increase muscle energy and improve exercise performance. People with fibromyalgia and chronic fatigue syndrome who took a D-ribose supplement have been reported to improve. The proposed explanation is that the supplement helps these patients produce more ATP, because their bodies cannot produce a sufficient amount on their own.

D-ribose is also reported to improve heart function in patients who suffer from congestive heart failure (CHF). Ischaemia, a sudden decrease in blood supply, reduces the myocardial ATP level. Adding D-ribose replenishes ATP because it shortens the time needed to make and restore ATP. The patient can therefore exercise longer before experiencing left chest pain, because the heart has an adequate amount of ATP. D-ribose also aided in regulating blood circulation in the heart by normalizing blood flow through the left ventricle and atrium to accommodate the sudden change in blood supply. As a result, patients suffering from CHF have an improved quality of life after taking D-ribose supplements, because they can do more physical activity and return to a near-normal lifestyle.

D-ribose supplements are also important to athletes because they quickly replenish ATP levels in muscle, which helps increase stamina and aids in strength building. D-ribose shortens the time it takes to make ATP because it enters the pentose phosphate pathway directly as ribose-5-phosphate, without passing through the glucose-6-phosphate dehydrogenase and 6-phosphogluconate dehydrogenase steps, which are rate-limiting. These rate-limiting enzymes slow the production of ATP, so bypassing them allows ATP to be produced at a higher rate. Hence, ATP lost during exercise is restored faster.

Summary of the roles:
1. Provide a backbone for DNA and RNA
2. Restores ATP in the body
3. Improve muscle stamina
4. Regulate blood circulation in the heart.

Natural sources of D-Ribose


D-ribose is naturally produced by the human body and is not found in food sources. However, riboflavin, which aids in the production of D-ribose, is found in a plethora of foods. Riboflavin, also known as vitamin B2, is found in eggs, milk products, nuts, vegetables, beef, and other protein sources. These foods should be kept in dimly lit areas, because light can damage riboflavin.



Aside from helping form D-ribose, riboflavin also helps fight free radicals that can damage cells, so it also acts as an antioxidant. Free radicals can damage cells, accelerate aging, and contribute to health conditions such as heart disease and cancer. Riboflavin also helps produce red blood cells, helps convert vitamin B6 into a form the body can use, and helps skin develop properly.

Summary of roles:
1. Helps form ribose, which is then converted to D-ribose.
2. Acts as an antioxidant.
3. Helps produce red blood cells.
4. Converts vitamin B6 into a form the body can use.
5. Helps skin develop properly.






A DNA nucleotide is composed of three main units: a five-carbon monosaccharide (deoxyribose), a phosphate group, and a nitrogenous base. While the monosaccharide and phosphate group alternate in sequence and form the backbone of the DNA double helix, the nitrogenous base may differ from one nucleotide to the next. The four nitrogenous bases present in DNA are adenine (A), guanine (G), cytosine (C), and thymine (T). In RNA, the only differing base is uracil (U), which replaces thymine and differs from it only by lacking the methyl group at carbon 5 of the pyrimidine ring. Of the nitrogenous bases, adenine and guanine are purines, in which a pyrimidine ring is fused to an imidazole ring, while cytosine, thymine, and uracil are pyrimidines, which are single-ring aromatic compounds.

Nitrogenous bases, being hydrophobic, tend to face the inside of the double helix, away from the surrounding aqueous environment. If the phosphate backbones faced the inside of the double helix, too many negative charges would be clustered together, making the double helix an unlikely product. The nitrogenous bases of the two DNA strands are linked by hydrogen bonds, with three hydrogen bonds connecting cytosine and guanine and two connecting adenine and thymine, while the stacked base pairs are held in close contact by van der Waals interactions. The aromaticity of the nitrogenous bases accounts for the DNA absorbance peak at 260 nm.
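The pairing rules above (A with T, G with C) are enough to derive one strand of a double helix from the other. The sketch below assumes the usual convention that both strands are written 5' to 3' and run antiparallel, which is why the result is reversed; the function name `reverse_complement` is chosen for the example.

```python
# Watson-Crick pairing partners in DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' to 3'.

    The two strands of a double helix are antiparallel, so the
    partner is read in the opposite direction.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand))


print(reverse_complement("ATGC"))  # GCAT
```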

What is a Purine?


The name was coined by the German chemist Emil Fischer in 1884. A purine is a nitrogenous base that is planar, aromatic, and heterocyclic. Its structure is a six-membered pyrimidine ring fused to a five-membered imidazole ring, with nitrogen atoms at positions 1, 3, 7, and 9. Adenine (A) and guanine (G) are examples of purines; they are bases of both DNA and RNA. Purines are also part of the structures of adenosine diphosphate (ADP), adenosine triphosphate (ATP), and various coenzymes. Purines bond to pentoses exclusively through the nitrogen atom at position 9.

Purine. Two of the bases found in both DNA and RNA, adenine (A) and guanine (G), are derivatives of purine.

6-amino and 2-amino-6-oxy purine


One derivative of purine, adenine (A), is also commonly known as 6-amino purine. It contains an amine group attached to the carbon atom at position 6, which is double-bonded to the nitrogen atom at position 1 and single-bonded to the carbon atom at position 5. Another derivative of purine, guanine (G), is also known as 2-amino-6-oxy purine. It contains an amine group attached to the carbon atom at position 2, which is double-bonded to the nitrogen atom at position 3 and single-bonded to the nitrogen atom at position 1. Guanine also has a carbonyl group at position 6, hence the 6-oxy.

6-amino purine; Adenine
2-amino-6-oxy purine; Guanine. Arrows indicate direction of hydrogen bonding.

Purine content in foods


Food is responsible for approximately 30% of the uric acid in the blood, so regular diet can affect the uric acid level. Some foods increase blood acidity even if their purine content is low.

Lowest level of Purine: 0–50 mg


tea, coffee, soda, nuts, dairy products, vegetables, cereal, fruits, preserve foods, sweets

Moderate level of purine: 50–150 mg


spinach, avocado, beef, turkey, lamb, oyster, fish, peanuts, sausages, ducks, chickens

High level of purine: 150–1000 mg


kidney, liver, heart, caviar, scallops, lobster, sardines, Thai fish sauce
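The three ranges above can be turned into a simple classifier. This is a sketch under stated assumptions: the text does not say how to treat the shared boundary values (50 and 150 mg) or what serving size the amounts refer to, so the code assigns both boundaries to the moderate band and treats anything above 150 mg as high. The function name `purine_level` is hypothetical.

```python
def purine_level(mg: float) -> str:
    """Classify a purine content (in mg) into the bands listed above.

    Assumption: the boundary values 50 and 150 fall in the moderate
    band, since the listed ranges share their endpoints.
    """
    if mg < 0:
        raise ValueError("purine content cannot be negative")
    if mg < 50:
        return "lowest"
    if mg <= 150:
        return "moderate"
    return "high"


for amount in (30, 100, 400):
    print(amount, purine_level(amount))
```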



A diet high in purines can lead to gout, a form of arthritis with symptoms of severe pain, redness, and swelling. Uric acid is a product formed from the breakdown of purines. Uric acid builds up in one's joints, causing the inflammation and resultant pain.

2 Types of Purine Disorders of Nucleotide Synthesis


Adenylosuccinase deficiency


This deficiency is associated with psychomotor retardation and seizures, and it is marked by a high level of succinyladenosine in the urine. Currently, there is no treatment.

Phosphoribosylpyrophosphate synthetase superactivity


A recessive disorder that causes overproduction of purines, which results in gout or other developmental effects. Treatment could include a diet low in purines.



Purines are biochemically significant in a myriad of biomolecules besides DNA and RNA, such as ATP, GTP, cyclic AMP, NADH, and coenzyme A. Although purine itself has not been found in nature, it can be produced through organic synthesis. Purines can also act as neurotransmitters, acting upon purinergic receptors (e.g., adenosine activates adenosine receptors).



Many organisms utilize metabolic pathways in order to synthesize and break down purines. Biologically, purines are synthesized as nucleosides, which are bases attached to ribose.

Laboratory Synthesis


Purines can also be made artificially, not just through in vivo synthesis in purine metabolism. When formamide is heated in an open vessel at 170°C for 28 hours, purine is obtained.



1. Obtain a sample of formamide
2. Heat in an open vessel with a condenser for 28 hours in an oil bath at 170-190°C
3. Remove excess formamide through vacuum distillation
4. Reflux the residue with methanol
5. Filter the methanol solvent and remove by vacuum distillation


Adenine base. The NH group is bonded to the sugar within the nucleotide, and the other groups participate in hydrogen bonding to thymine (in DNA) and uracil (in RNA).

Structure & Function


Adenine (A) is one of the four bases that make up nucleic acids. It is a purine base that pairs with thymine (T) in DNA and uracil (U) in RNA. The pairing consists of two hydrogen bonds, which help stabilize nucleic acid structures. Different structures of adenine mainly result from tautomerization, which allows the molecule to exist in isomeric forms in chemical equilibrium. The molecular formula of adenine is C5H5N5.
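As a worked example of reading a molecular formula, the molar mass of adenine can be computed from C5H5N5. The sketch below uses rounded standard atomic masses and a deliberately simple parser that handles only flat formulas made of C, H, N, and O (no parentheses); the function name `molar_mass` is chosen for the example.

```python
import re

# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(formula: str) -> float:
    """Sum atomic masses for a flat formula such as 'C5H5N5'."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


print(round(molar_mass("C5H5N5"), 2))  # adenine, about 135.13 g/mol
```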

An adenine molecule bound to a deoxyribose, a sugar, is known as deoxyadenosine. An adenine bound to ribose, also a sugar, is known as adenosine, a key component in Adenosine Triphosphate. When adenosine attaches to three phosphate groups, a nucleotide, adenosine triphosphate (ATP) is formed. Adenosine triphosphate is an important source of energy that is used in many cellular mechanisms, primarily in the transfer of energy in chemical reactions. The phosphate of ATP can detach, resulting in a release of energy.

In addition to ATP, adenosine also plays a key role in other organic molecules such as nicotinamide adenine dinucleotide (NAD) and flavin adenine dinucleotide (FAD), both of which are involved in metabolism. Adenine can also be found in tea, vitamin B12, and several other coenzymes.

Formation and other forms of Adenine


In the human body, adenine is synthesized in the liver. Biological systems tend to conserve energy, so adenine is usually obtained through the diet: the body degrades nucleic acid chains to obtain individual bases and reassembles them into new nucleic acids during cell division. The vitamin folic acid is important for adenine synthesis.

Adenine forms adenosine, a nucleoside, when attached to ribose, and deoxyadenosine when attached to deoxyribose; it forms adenosine triphosphate (ATP), a nucleotide, when three phosphate groups are added to adenosine. Adenosine triphosphate is used in cellular metabolism as one of the basic methods of transferring chemical energy between reactions.

In older literature, adenine was sometimes called Vitamin B4. However it is no longer considered a true vitamin (see Vitamin B). Some think that, at the origin of life on Earth, the first adenine was formed by the polymerizing of 5 hydrogen cyanide (HCN) molecules.



Adenine is one of the products of purine metabolism, in which inosine monophosphate (IMP) is synthesized on a pre-existing ribose through a complex process involving atoms from the amino acids glycine, glutamine, and aspartic acid, in addition to formate ions transferred from the coenzyme tetrahydrofolate.

Tautomerization

Tautomers are isomers related by a change in the position of a single hydrogen and a single double bond in a three-atom system, such as the keto and enol tautomers of a ketone. Like keto-enol tautomers, adenine, as well as cytosine, guanine, thymine, and uracil, may go through tautomerization, interchanging between the amino and imino functionality by intermolecular proton transfer.





Guanine base. The NH group is bonded to the sugar within the nucleotide, and the other groups participate in hydrogen bonding to cytosine.

Guanine is one of the five nucleobases found in DNA and RNA. Its formula is C5H5N5O, and it is a planar, bicyclic molecule. Guanine has two forms, keto and enol; the keto form is the major form. Guanine, like adenine, is a derivative of purine, and it binds to cytosine through three hydrogen bonds. In cytosine, the amino group is the hydrogen-bond donor, and the C2 carbonyl and the N3 nitrogen are the hydrogen-bond acceptors. In guanine, the carbonyl group at C6 acts as the hydrogen-bond acceptor, and the N1 hydrogen and the amino group at C2 act as the hydrogen-bond donors. The nucleoside containing guanine and ribose is called guanosine, and guanine bound to deoxyribose is called deoxyguanosine.

Guanine is capable of being hydrolyzed by strong acids to form ammonia, carbon monoxide, carbon dioxide, and glycine. Guanine oxidizes more readily than adenine, another purine-derivative nitrogenous base in nucleic acids. Guanine has a high melting point of 350°C due to the intermolecular hydrogen bonds between the oxo and amino groups in the crystal of the molecule. Also because of this intermolecular bonding, guanine is relatively insoluble in water as well as in weak acids and bases.

DNA base pair bonding


From the image on the left, it can be seen that guanine and cytosine bond together through noncovalent hydrogen bonding at three distinct sites. Since cytosine and guanine share three hydrogen bonds and adenine and thymine share two, DNA with a higher GC content has a higher melting point than DNA with a higher AT content. Interestingly, Watson and Crick first hypothesized that guanine and cytosine bonded together through hydrogen bonding at two distinct sites. [1]
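The link between GC content and duplex stability can be illustrated by counting hydrogen bonds. The sketch below only tallies two bonds per A-T pair and three per G-C pair for one strand of a fully paired duplex; it does not predict a melting temperature, and the function names are chosen for the example.

```python
# Hydrogen bonds contributed by the base pair each base belongs to.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds in a duplex, given one of its strands."""
    return sum(H_BONDS[base] for base in strand)


def gc_fraction(strand: str) -> float:
    """Fraction of positions that are G or C."""
    return sum(base in "GC" for base in strand) / len(strand)


for seq in ("AATT", "ATGC", "GGCC"):
    print(seq, gc_fraction(seq), hydrogen_bonds(seq))
```

For strands of equal length, a higher GC fraction always gives a higher bond count, which is the trend described in the text.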



Guanine may go through tautomerization, interchanging from the keto to the enol functionality by intermolecular proton transfer.



Guanine is also the name of the white amorphous substance found in fish scales. It serves as an additive to various products such as shampoos, metallic paints, and simulated pearls and plastics providing a pearly iridescent effect. Also, it adds a shimmering luster to eye shadow and nail polish. This pearly luster is produced by the crystalline form of guanine which are rhombic platelets composed of multiple transparent layers that have a high index of refraction that partially reflects and transmits light from layer to layer. To provide this effect, it can be applied by spraying, painting, or dipping.


  1. Crick, Francis H. (1953). "Molecular Structure of Nucleic Acids". Nature. 171: 737-738.

Berg, Jeremy M., John L. Tymoczko, and Lubert Stryer. Biochemistry, Sixth Edition. New York: W.H. Freeman and Company, 2007.

Purine is a heterocyclic aromatic organic compound consisting of a pyrimidine ring fused to an imidazole ring. Purines and pyrimidines make up the two groups of nitrogenous bases. The name was coined by the German chemist Emil Fischer in 1884. Below are the DNA bases.

DNA Bases



Hypoxanthine (6-Hydroxypurine) is a naturally occurring purine derivative and deaminated form of adenine. It is an intermediate in the purine catabolism reaction and is occasionally found as a constituent in the anticodon of tRNA as the nucleosidic base inosine. It is also utilized as a nitrogen source in bacteria and parasite cultures for energy metabolism and nucleic acid synthesis.




Hypoxanthine exists as an intermediate in the biodegradation of AMP (adenosine monophosphate). It is first converted to xanthine with xanthine oxidase before it is excreted as urate.


A deleterious reaction that can occur is a spontaneous deamination of adenine to form hypoxanthine. This is a mutagenic process because the result is a pairing of hypoxanthine with cytosine rather than thymine, due to hypoxanthine’s guanine-like form. This could lead to an error in DNA transcription and replication.




Berg, et al. Biochemistry, 6th Ed. 2007.



Xanthine is a purine base that is a precursor of uric acid and is generally found in muscle tissue, blood, urine, and some plants. It is a toxic, yellowish-white powder that is insoluble in water and acids but soluble in caustic soda; it sublimes when heated. It is involved in purine degradation: it is formed from hypoxanthine and converted to uric acid by xanthine oxidase. Some of its derivatives are widely known as mild stimulants, including caffeine, a sleep-inhibiting methylated xanthine found in coffee, and theobromine, a bitter alkaloid found in cacao.




There is a genetic disease of xanthine metabolism, xanthinuria, due to deficiency of an enzyme, xanthine oxidase. Xanthinuria is a rare genetic disorder where individuals are unable to convert xanthine into uric acid because of the lack of enzyme xanthine oxidase resulting in an accumulation of xanthine. Symptoms include renal failure and kidney stones. There is currently no treatment available to cure this disease.

Clinical Use


Xanthine derivatives are collectively known as xanthines, a group of alkaloids used as stimulants and bronchodilators. Because of their widespread side effects, many of these derivatives are treated as second-line asthma medications.



Berg, et al. Biochemistry, 6th Ed. 2007.



Theobromine (xantheose) is a xanthine derivative and bitter alkaloid commonly found in cacao plants. Its name is derived from the name of the genus of the cacao tree. It does not contain bromine, as its name might suggest. Its structure is similar to that of caffeine, another well-known purine and xanthine derivative, except that it contains one fewer methyl group. It was first discovered in the cacao plant in 1841, isolated in 1878, and synthesized from xanthine by Hermann Emil Fischer shortly thereafter. In its pure form, it is a water-insoluble, crystalline white powder that has a milder effect than caffeine. Since dark chocolate has higher concentrations of theobromine than milk chocolate, its beneficial effects are better obtained from the less diluted dark chocolate.

Therapeutic uses


Theobromine is known as a diuretic: by increasing the production of urine, it promotes the removal of excess fluid accumulated in the body from edema and the flushing of excess salts.

It is also widely used as a vasodilator, which widens blood vessels and improves blood flow. This, in turn, helps reduce blood pressure, although it is reputed that flavanols have a bigger role in promoting that effect.

A 2004 patent on the future use of theobromine for cancer prevention was granted due to recent research that revealed anti-carcinogenic activity.





Theobromine has a weaker effect on the human central nervous system than caffeine because of its weaker inhibition effects on cyclic nucleotide phosphodiesterases and its antagonism of adenosine receptors. As for its effect on the heart, theobromine stimulates it to a much greater degree than caffeine. It is cited as being involved in contributing to chocolate’s role as an aphrodisiac.

Since theobromine is a myocardial stimulant, it increases the heart rate. As stated above, it also dilates blood vessels, which reduces blood pressure. It is possible that theobromine might be able to treat cardiac failure, since its diuretic properties allow excess fluid to drain. Ingesting too much theobromine could lead to adverse effects. Since it is a diuretic, it increases the amount of urine a person produces. It could also cause nausea, restlessness, sleeplessness, and anxiety.



A helpful hint in responsible pet-keeping is not to feed dogs or cats products containing cacao, because they metabolize theobromine much more slowly than humans. Doing so exposes your pet to theobromine poisoning, which causes digestive issues, dehydration, excitability, and a slow heart rate. Larger quantities of theobromine can result in epileptic-like seizures and even death.

What is a Pyrimidine?


A pyrimidine is a 6-membered heterocyclic organic compound made up of 4 carbon atoms and 2 nitrogen atoms at positions 1 and 3.[1] It is one of three isomers of diazine, the other two being pyridazine (1,2-diazine) and pyrazine (1,4-diazine).[2] Pyrimidines are aromatic and planar. The nucleobases cytosine (C), uracil (U), and thymine (T) are all examples of pyrimidines, each with different chemical groups. A pyrimidine can attach to a phosphorylated sugar, either a ribose (which has a hydroxyl group at C-2') or a deoxyribose (which has a hydrogen atom at C-2'), through a glycosidic linkage at N-1 to form a nucleotide, the monomeric building block of nucleic acids (DNA and RNA).

Pyrimidine. Two of the bases found in DNA, cytosine (C) and thymine (T), and a base found only in RNA, uracil (U), are derivatives of pyrimidine.


Pyrimidine Biosynthesis


1. Unlike in purine synthesis, the ring is synthesized first and then attached to the ribose phosphate.

2. It needs carbamoyl phosphate synthetase, which is located in the cytoplasm.

3. It also needs an enzyme in order for the reaction to work, but the enzyme should be controlled in 2 steps:

  • controlled level at where the reaction occurs & transcriptions must be reduced
  • the pyrimidine nucleotides which produces the feedback inhibition level also must be controlled

4. The ring then closes.

5. The C=C double bond is formed when the ring is oxidized.

Thymine base
Cytosine base
Uracil base

Chemical Properties


Pyrimidine has properties similar to those of pyridine. One similarity is that as the number of nitrogen atoms in the ring increases, the ring pi electrons become less energetic and, as a result, electrophilic aromatic substitution gets more difficult while nucleophilic aromatic substitution gets easier. One example is the displacement of the amino group in 2-aminopyrimidine by chlorine and its reverse reaction. The reduced resonance stabilization of pyrimidines leads to addition and ring-cleavage reactions rather than substitutions. An example of this is the Dimroth rearrangement. Pyrimidines are less basic than pyridines, and N-alkylation and N-oxidation are more difficult in pyrimidines as well.

Cytosine base. The NH group is bonded to the sugar within the nucleotide, and the other groups participate in hydrogen bonding to guanine.



Cytosine is part of the pyrimidine family, and it is one of the 5 nucleotide bases found in both DNA and RNA. The molecular formula of cytosine is C4H5N3O. Cytosine consists of a heterocyclic aromatic ring, an amine group at C4, and a keto group at C2. Cytosine binds with ribose to form the nucleoside cytidine and with deoxyribose to form deoxycytidine.

Nucleoside: Cytidine

The molecule is planar, and cytosine forms 3 hydrogen bonds with guanine in the DNA double helix. The nucleoside of cytosine in RNA is cytidine, which consists of cytosine and ribose. In DNA, it is called deoxycytidine, which consists of cytosine and deoxyribose. The nucleotide of cytosine in DNA is deoxycytidylate, which consists of cytosine, deoxyribose, and phosphate.



In 1894, cytosine was discovered through the hydrolysis of calf thymus tissue. The first structure for cytosine was published in 1903, and the structure was validated when it was synthesized that same year (The Columbia Encyclopedia).

Chemical Activity


From the image on the left, it can be seen that Guanine and Cytosine bond together through noncovalent hydrogen bonding at three distinct sites. An interesting note is that Watson and Crick first hypothesized that Guanine and Cytosine bonded together through hydrogen bonding at two distinct sites. [3]

Cytosine is found in DNA and RNA, or as part of a free nucleotide. When the nucleoside cytidine binds three phosphate groups, it forms cytidine triphosphate (CTP). This molecule can act as a cofactor for enzymes, and it can transfer a phosphate to convert adenosine diphosphate (ADP) to adenosine triphosphate (ATP), which is then available for use in chemical reactions.

In DNA and RNA, cytosine binds with guanine through 3 hydrogen bonds. However, cytosine is unstable and can change into uracil, a process called spontaneous deamination. This can lead to a point mutation if DNA repair enzymes such as uracil glycosylase do not repair it by cleaving the uracil out of the DNA.



Cytosine may go through tautomerization, interchanging from the amino to the imino functionality by intermolecular proton transfer.


  3. Crick, Francis H. (1953). "Molecular Structure of Nucleic Acids". Nature. 171: 737-738.

Berg, Jeremy M. John L. Tymoczko. Lubert Stryer. Biochemistry Sixth Edition. New York: W.H. Freeman and Company, 2007.

CYTOSINE. The Columbia Encyclopedia, Sixth Edition

Uracil base. Present only in RNA, the N1 of the molecule bonds to the sugar within the nucleotide, and the other groups participate in hydrogen bonding to adenine.



Uracil is one of the five nucleobases, alongside adenine, guanine, cytosine, and thymine, but it is found only in RNA. It is a naturally occurring pyrimidine derivative with the molecular formula C4H4N2O2. Uracil is planar and unsaturated and has the ability to absorb light.



Uracil is found in RNA, where it binds to adenine via 2 hydrogen bonds; in DNA it is replaced by thymine. Methylation of uracil produces thymine. Uracil can pair with any of the bases depending on arrangement. Despite this, it readily pairs with adenine because the methyl group is repelled into a fixed position. In the uracil and adenine bond, uracil is the hydrogen bond acceptor and the adenine is the donor. When attached to a ribose sugar, the compound is called uridine, a nucleoside. A phosphate then attaches to uridine to form uridine 5'-monophosphate. Nucleotides are formed through a series of phosphoribosyltransferase reactions. This produces substrates, aspartate, carbon dioxide, and ammonia.

Uracil tautomerization: lactam structure (left) and lactim structure (right)

Uracil, like other bases, undergoes tautomerization. The keto tautomer is referred to as the lactam structure, while the imidic acid tautomer is referred to as the lactim structure. Both tautomeric forms are present at pH 7, with the lactam structure being the major form of uracil.

Uracil is a weak acid.

Chemical Activity


Uracil is capable of undergoing reactions such as oxidation, nitration, and alkylation. It can also react with elemental halogens because of the presence of more than one strongly electron donating group. A useful property of uracil is that in the presence of PhOH/NaOCl, it can be visualized in the blue region of UV light.

As stated above, uracil can take part in synthesis, binding with ribose sugars and phosphates to form very useful molecules such as uridine, uridine monophosphate (UMP), uridine diphosphate (UDP), and uridine triphosphate (UTP).

Ribonucleoside: Uridine



Uracil is a nucleobase that was discovered in the 1900s through the hydrolysis of yeast (Brown 1994). Uracil is an important component in helping enzymes carry out different reactions and in the making of polysaccharides (New World Encyclopedia). Because uracil helps enzymes carry out different reactions in cells, it is important in the drug industry, where it helps with delivering drugs throughout the body. Even though it is useful for drug delivery, it can increase the risk of cancer when the body is missing the nutrient folate (The Individualist). Uracil occurs naturally; however, it can also be synthesized in the laboratory by mixing water with cytosine. This reaction produces two compounds, uracil and ammonia (Wikipedia).
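The conversion of cytosine and water into uracil and ammonia can be checked for atom balance. The sketch below is illustrative only: it uses the molecular formulas given in this chapter for cytosine (C4H5N3O) and uracil (C4H4N2O2) together with the standard formulas for water and ammonia, and the helper names are assumptions rather than anything from the source.

```python
import re
from collections import Counter


def atoms(formula):
    """Count the atoms in a flat molecular formula such as 'C4H4N2O2'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def is_balanced(reactants, products):
    """True when both sides of the reaction contain the same atoms."""
    left = sum((atoms(f) for f in reactants), Counter())
    right = sum((atoms(f) for f in products), Counter())
    return left == right


# cytosine + water -> uracil + ammonia
balanced = is_balanced(["C4H5N3O", "H2O"], ["C4H4N2O2", "NH3"])
print("Balanced:", balanced)
```

Each side carries 4 carbons, 7 hydrogens, 3 nitrogens, and 2 oxygens, so the amino group of cytosine leaves as ammonia while an oxygen from water takes its place.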



Uracil may go through tautomerization, interchanging between the keto and enol forms by intermolecular proton transfer, owing to its electron-rich ring.



New World Encyclopedia. Uracil. 17 November 2008.

Wikipedia. Uracil. 17 November 2008.

Brown, D.J. Heterocyclic Compounds: The Pyrimidines. Vol 52. New York: Interscience, 1994.

The Individualist. Uracil. 17 November 2008.

Thymine base. Present in only DNA, the N1 of the molecule bonds with the sugar within the nucleotide, and the other groups participate in hydrogen bonding to adenine.



Thymine is uracil methylated at the 5th carbon, hence the other name of thymine, 5-methyluracil. Uracil takes its place in RNA, where it also binds to adenine. Thymine is a single-ring planar molecule. Thymine combined with deoxyribose yields deoxythymidine (usually just called thymidine), while thymine combined with ribose yields ribothymidine.

Thymine binds with deoxyribose to form the nucleoside deoxythymidine, which is the same thing as thymidine. This compound can be phosphorylated with one, two, or three phosphoric acid groups creating thymidine mono-, di-, or triphosphate, respectively.

Nucleoside: Thymidine

Thymine is a part of one of the most common mutations of DNA, which involves two adjacent thymines or cytosines. In the presence of UV light, this may form thymine dimers, causing "kinks" in the DNA molecule, interfering with normal function.
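Because these lesions involve two adjacent thymines or cytosines, the candidate sites on a strand can be listed by scanning for neighboring pyrimidines. This is a minimal sketch with a made-up example sequence and a hypothetical function name; it only locates adjacent T/C pairs and says nothing about how likely any given site is to react.

```python
def adjacent_pyrimidine_sites(strand):
    """Return (index, dinucleotide) for each pair of neighboring T/C bases.

    The index is the 0-based position of the first base of the pair.
    """
    strand = strand.upper()
    pyrimidines = {"T", "C"}
    return [
        (i, strand[i:i + 2])
        for i in range(len(strand) - 1)
        if strand[i] in pyrimidines and strand[i + 1] in pyrimidines
    ]


# Hypothetical DNA strand used only for illustration.
sites = adjacent_pyrimidine_sites("AGTTCAGCCTA")
for position, pair in sites:
    print(position, pair)
```

Overlapping pairs are reported separately, so a run such as TTC yields both TT and TC.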

Thymine is relevant to cancer treatment, where it serves as a target for the action of 5-fluorouracil (5-FU). Substitution of this compound for thymine (in DNA) and uracil (in RNA) inhibits DNA synthesis in actively dividing cells.



Thymine is a heterocyclic aromatic organic compound and a pyrimidine nucleobase. Heterocyclic compounds are organic compounds (those containing carbon) whose ring structure contains atoms in addition to carbon, such as sulfur, oxygen, or nitrogen. Aromaticity is a chemical property in which a conjugated ring of unsaturated bonds, lone pairs, or empty orbitals exhibits a stabilization stronger than would be expected from conjugation alone.

As the name 5-methyluracil implies, thymine may be derived by methylation of uracil at the fifth carbon. In DNA, thymine (T) binds to adenine (A) via two hydrogen bonds, helping to stabilize the nucleic acid structure.

Thymine joined with deoxyribose creates the nucleoside deoxythymidine, which is synonymous with thymidine. Thymidine can be phosphorylated with one, two, or three phosphoric acid groups, creating TMP, TDP, or TTP (thymidine mono-, di-, or triphosphate), respectively.

One of the common mutations of DNA involves two neighboring thymines or cytosines, which in the presence of ultraviolet light may form thymine dimers, causing "kinks" in the DNA molecule that impair normal function.

Thymine can also be a target for the action of 5-FU in cancer treatment. 5-FU can act as a metabolic analog of thymine (in DNA synthesis) or uracil (in RNA synthesis). Substitution of this analog inhibits DNA synthesis in actively dividing cells.



Thymine may go through tautomerization, interchanging between the keto and enol forms by intermolecular proton transfer.



Al Mahroos, M., et al. “Effect of sunscreen application on UV-induced thymine dimers.” Arch Dermatol 138: 1480-5, 2002.

Ribonucleotide reductase (or RNR) is the enzyme responsible for catalyzing the reduction of ribonucleotides to deoxyribonucleotides. These deoxyribonucleotides can then be utilized by the cell in DNA replication. Additionally, because of the role RNR plays in the formation of deoxyribonucleotides, RNRs are responsible for regulating the rate of DNA synthesis within the cell.[1]

Classes of RNR[2]

  1. Class I: Class I RNRs consist of three subgroups (Ia, Ib, and Ic) which differ only slightly in primary structure; all three subgroups contain two different dimeric subunits (R1 and R2) and require oxygen in order to form a stable radical. Class Ic RNRs are the most recently discovered, first found in Chlamydia trachomatis. Evidence also suggests their existence in archaea and eubacteria. The sequence of class Ic RNRs shows that residues in the PCET pathway and in the active site for nucleotide reduction are similar among the three subgroups.[3]
  2. Class II: Class II RNRs form thiyl radicals with the help of adenosylcobalamin, which fulfills the role of the R2 subunit as a radical generator, and utilize thioredoxin or glutaredoxin as electron donors. Class II RNRs are therefore made up of only one subunit, occur as monomers or dimers, and neither require nor are inhibited by the presence of oxygen.
  3. Class III: Class III RNRs, like Class I RNRs, are made up of two dimeric protein subunits (NrdG and NrdD); however, unlike in Class I RNRs which require R2 continuously to generate radicals, the small NrdG is only required during the activation of NrdD. The mechanism of Class III RNRs uses formate as an electron donor and generates an oxygen-sensitive glycyl radical, thus rendering the enzymes inactive in the presence of oxygen.

Radical Mechanism of RNR


Despite the differences in structure and electron donor, all three classes of RNR proceed via a free radical mechanism.[4] Ultimately RNR catalyzes a reaction which results in the replacement of the 2'-hydroxyl group of the ribose with a hydrogen atom resulting in a deoxyribose moiety.

Metallocofactor Assembly in Class I RNR[5]


Although the Class I RNRs (Ia, Ib, and Ic) have comparable structures and pathways, the metallocofactors that RNRs need in order to catalyze the conversion of nucleotides to deoxynucleotides differ remarkably. The mechanisms that generate these cofactors, both in vitro and in vivo, and the ways in which damaged cofactors are repaired show how strongly each subgroup depends on its own cofactor. Studies of the pathways and activation of these metallocofactors have helped our understanding of how biology prevents mismetallation and achieves cluster formation in high yields. All three class I RNRs share a common catalytic mechanism in which the metal cofactor is involved, directly or indirectly, in the oxidation of the conserved cysteine in the active site of alpha to a thiyl radical (S•). In class Ia and Ib RNRs this oxidation is carried out by the Y•.

  1. Class IA: Class IA RNR requires a FeIIIFeIII-Y• cofactor. It is localized in β2 at the end of a hydrophobic channel, the supposed access route for O2 during cluster assembly. Studies of E. coli showed that in vitro incubation of apo-β2 with FeII, O2, and reductant resulted in self-assembly of the FeIIIFeIII-Y• cofactor. In vivo, this process likely requires at minimum a single small protein or molecule to deliver FeII to apo-β2 and to deliver the extra reducing equivalent required to reduce O2 to H2O. This is also plausible because Ia RNR binds MnII more tightly than FeII, thus requiring some type of chaperone protein to ensure proper metallation.
  2. Class IB: Class IB RNR is active with both FeIIIFeIII-Y• and MnIIIMnIII-Y• cofactors. The enzymes can form active FeIIIFeIII-Y• cofactors in vitro, but only the MnIIIMnIII-Y• cofactor was found to be relevant in vivo. This cofactor has been proposed to form via oxidation of a MnIIMnII center by an oxidant that the flavoprotein NrdI creates by reducing O2. In E. coli, studies have found that the manganese cofactor is induced when iron levels in the cell are low, pointing to the significance of manganese in this and other organisms. There is also organism-dependent variation in metal homeostasis to be considered, which may help explain why some organisms rely on one cofactor more frequently than the other.
  3. Class IC: Class IC RNR differs from Class Ia and Ib RNRs in its proposed bimetallocofactor, MnIVFeIII. Class Ic RNRs store a one-electron oxidizing equivalent in their metal cluster. In vitro self-assembly of Ic is similar to that of Ia and Ib in that the protein reacts with O2 and a reductant to form its MnIVFeIII cofactor; however, it differs in that it can also react with 2 equivalents of H2O2 to form the active cofactor. The class Ic RNR has been isolated from its native organism in vivo, complicating its assembly as the two different metals have similar affinities for the protein. In vitro studies in C. trachomatis have shown the necessity of regulating the levels of the metals, along with their order of addition.

Problems exist with proper metal loading in all three subgroups of Class I RNR. The class Ia RNR requires a FeIIIFeIII-Y• cofactor, but the protein tends to bind MnII more tightly than FeII. In E. coli, correct metallation of NrdB depends on the amounts of free MnII and FeII present, while iron chaperones are also present to overcome the preference for MnII. The issue in class Ib RNR is that it may bind either the FeIIIFeIII-Y• or the MnIIIMnIII-Y• cofactor, but only the manganese cofactor was found to be relevant in vivo. Ib binding depends on the preference of the individual organism and on the concentrations of each metal that it inherently possesses. The class Ic RNR complicates metallocofactor assembly since it requires two different metals with similar affinities for the same protein. Regulating the levels of both metals is important in order to prevent mismetallation, and success depends on the presence of both. In C. trachomatis, an absent or insufficient level of MnII may lead to diiron cluster formation instead. In general, if the levels of any of the required metals in a class I RNR are not regulated, low activity and improper metallation result, and ultimately DNA synthesis is affected.

Biosynthesis and Repair of Metal Cofactors in Class I RNR[6]


Certain general principles and challenges arise when studying metallocofactor formation with different metals and levels of complexity, as summarized below. Physiological expression conditions are taken into account in studies of metalloenzymes to confirm that the form of the protein studied in vitro is the same as its active form in vivo. Class I RNRs can control the concentration of the active metal cofactors through biosynthetic and repair pathways.

  1. Cofactors of metal proteins are generated by specific biosynthetic pathways.
  2. The proteins involved in the biosynthetic pathway are often associated with the operon of the metalloprotein of interest, and certain factors can be analyzed by comparing genomic sequences.
  3. To facilitate the exchange of ligands and protein factors, metals are transferred in their reduced state.
  4. There exists a variety of protein factors which include: metal insertase or chaperone to deliver the metal to the active site, specific redox proteins which control the oxidation state of the metal, and GTPases or ATPases which aid in the folding and unfolding processes to allow the metal to be inserted in the active site.
  5. Because biological redundancy affects pathway factors, multiple gene deletions are often required in order to identify a phenotype in a gene deletion experiment.
  6. A hierarchy of metal delivery to proteins and its regulation is inferred but not completely understood.
  7. Compartmentalization (e.g. periplasm vs cytosol in prokaryotes) and the affinities of proteins for certain metals are two likely factors that help prevent mismetallation at the cellular level.
  8. Several proteins have not been isolated from their native source and instead come from heterologous expression systems, which can lead to mismetallation. Since the optimum level of activity is not fully known, incorrect clusters corresponding to low activity may not be recognized.
  9. Certain oxidants can cause damage to the metal clusters (e.g. NO and O2) and specific pathways are used in their repair.
  10. Changes in the oxidation state of the metal typically require protons. The ligands that bind the metal can reorganize easily, and rearrangement of the carboxylate ligands is critical to the cluster assembly process.

One of the biggest complications is that the metal required for activity is often not the metal that has the highest affinity for a specific protein. The Irving-Williams series (MnII < FeII < CoII < NiII < CuII > ZnII) best describes the relative affinities of proteins for divalent metals, although affinity also depends on the particular protein coordination environment where the binding takes place. For the later metals in the series, chaperone proteins exist to aid their movement to the active sites, while intracellularly they are likely to exist as "free" metals only at low concentration. Besides delivery, these chaperone proteins have another function: keeping the free concentration of these metals low to prevent mismetallation, that is, binding to other proteins that require MnII and FeII. Compartmentalization can overcome a protein's binding preference, as certain activities occur in different parts of the cell which have and require varying amounts of a metal. In cyanobacteria, it was found that a MnII-dependent periplasmic protein must fold in the cytosol, where MnII exists freely in a higher amount than ZnII, CuI, and CuII.

Techniques to Study RNR Activity[7]


Several laboratory techniques are used to monitor the activity of the RNR metallocofactors. By tracking the movement of these cofactors, they help establish accurate proposals for the mechanism, generation, and function of the cofactors in vitro and in vivo.

  1. Whole-Cell Electron Paramagnetic Resonance: EPR was used in studying FeIIIFeIII-Y• biosynthesis in S. cerevisiae. It was found that Y• levels were sufficiently high to be detected at endogenous levels under various growth conditions, meaning that the Y• is not modulated as a function of the cell cycle. A small molecule or protein factor must be needed to rapidly reduce the Y• in cell lysates, indicating the presence of a metallocofactor which was later identified to be iron.
  2. Mössbauer Spectroscopy: This type of spectroscopy monitors iron movement from oxidized and reduced iron pools into the RNR cofactor. It allows for the detection of all oxidation states of iron simultaneously and is sensitive to the surrounding electronic environments of the iron species present. In order for this technique to be accurate, cells first need to be labelled with the 57Fe isotope.

  1. Herrick J, Sclavi B. (2007) Ribonucleotide reductase and the regulation of DNA replication: an old story and an ancient heritage Mol Microbiol. 63:22–34
  2. Nordlund P, Reichard P (2006). Ribonucleotide Reductases Annu Rev Biochem, 75:681–706
  3. Cotruvo, Joseph, Jr., and Stubbe, JoAnne. (2011). Class I Ribonucleotide Reductases: Metallocofactor Assembly and Repair In Vitro and In Vivo Annual Review of Biochemistry, 80: 733-767
  4. Eklund H, Eriksson M, Uhlin U, Nordlund P, Logan D (1997). Ribonucleotide reductase--structural studies of a radical enzyme Biol Chem. 378:821–825
  5. Cotruvo, Joseph, Jr., and Stubbe, JoAnne. (2011). Class I Ribonucleotide Reductases: Metallocofactor Assembly and Repair In Vitro and In Vivo Annual Review of Biochemistry, 80: 733-767
  6. Cotruvo, Joseph, Jr., and Stubbe, JoAnne. (2011). Class I Ribonucleotide Reductases: Metallocofactor Assembly and Repair In Vitro and In Vivo Annual Review of Biochemistry, 80: 733-767
  7. Cotruvo, Joseph, Jr., and Stubbe, JoAnne. (2011). Class I Ribonucleotide Reductases: Metallocofactor Assembly and Repair In Vitro and In Vivo Annual Review of Biochemistry, 80: 733-767


Nucleotides consist of a base, a sugar, and a phosphate group. They are the building blocks of nucleic acids. Nucleotides are essential to the body for many reasons. They are needed for gene replication and transcription into RNA. They are also needed for energy. ATP, the body's form of energy, is a nucleotide with adenine as its base. Guanine nucleotides (GTP) are also a source of energy. Furthermore, derivatives of nucleotides are necessary in various biosynthetic processes. Nucleotides are necessary in signal transduction pathways as well.

The Biosynthesis of Nucleotides

There are two kinds of pathways in the biosynthesis of nucleotides: de novo and salvage. The following table contains similarities and differences between the two pathways.

De Novo | Similarities | Salvage
Simpler compounds are used in the synthesis of nucleotides. Numerous small pathways are repeated to assemble different nucleotides. | Both synthesize nucleotides, though they utilize different mechanisms. | Bases are preformed, recovered, and reconnected to a ribose.
Synthesizes pyrimidine nucleotides. Bicarbonate, aspartate, and glutamine are used to synthesize the ring of the pyrimidine. The ring then links with ribose phosphate, forming the nucleotide. | Both assemble ribonucleotides, which are then used to synthesize deoxyribonucleotides for DNA. | Synthesizes purine nucleotides. Various precursors may be used to form the purine ring, which is then added to ribose and phosphate.

Feedback inhibition regulates multiple steps in the biosynthesis of nucleotides. Examples include the regulation of aspartate transcarbamoylase in pyrimidine synthesis, which is inhibited by CTP and activated by ATP, and the inhibition of glutamine-PRPP amidotransferase by purine nucleotides.

Reduction of Ribonucleotides to Deoxyribonucleotides

Ribonucleotide reductase catalyzes the reduction of ribonucleoside diphosphates to deoxyribonucleoside diphosphates. In this process, electrons flow from NADPH to sulfhydryl groups at the active sites of ribonucleotide reductase. The reaction is summarized as follows:

1. An electron is transferred from a cysteine on R1 to the tyrosyl radical on R2. This creates a highly reactive cysteine thiyl radical in the active site of R1.
2. The thiyl radical abstracts a hydrogen from C3 of the ribose. This creates a carbon radical.
3. The C3 radical promotes the release of OH- from C2, which departs as H2O after protonation by a second cysteine residue.
4. A third cysteine residue then provides a hydride to complete the reduction at C2. This returns C3 to a radical and also generates a disulfide bond.
5. The C3 radical takes back the original hydrogen that the first cysteine had abstracted. A deoxyribonucleotide has now been generated and can leave the enzyme.

So What?

The biosynthesis and metabolism of nucleotides are important to the body because disruptions in them can result in pathology. If nucleotides are not degraded properly, certain conditions may arise. An example of this is gout. Urate is a product of purine degradation, and gout occurs when urate accumulates, damaging the joints and causing arthritis.

Similarly, if nucleotides are not synthesized properly, or if not enough are synthesized, conditions will arise as well. An example of this is Lesch-Nyhan syndrome. Its symptoms include mental deficiency, self-mutilation, and gout. The disease is due to the lack of an enzyme needed to synthesize purine nucleotides through the salvage pathway.

Source: Berg, Jeremy and Stryer, Lubert. Biochemistry: Fifth Edition. United States of America: W.H. Freeman and Company, 2002.

DNA and RNA Backbone


Macromolecules such as DNA and RNA are linear polymers built from monomers joined end to end. These monomers are known as nucleotides, and they consist of a nitrogenous base, a sugar, and a phosphate group. The chains of bonds between these nucleotides form the backbone of DNA and RNA, and this backbone allows the formation of unique genetic sequences. In DNA and RNA backbones, the monomers are connected by phosphodiester bridges. Specifically, each bridge is formed between the 3'-hydroxyl group of one sugar (ribose in RNA, deoxyribose in DNA) and the 5'-hydroxyl group of the adjacent sugar; this is called a 3'-5' phosphodiester bond. Chemically, to make this bond, the 3'-hydroxyl group of a sugar undergoes esterification with a phosphate group. That phosphate group is then attacked by the 5'-hydroxyl group of the adjacent sugar to form the phosphodiester bridge.

Once the phosphodiester bond is established, the backbone needs to be preserved in order to maintain the genetic information of the nucleotide sequence. Thus, no further nucleophilic attacks may occur on the backbone. The phosphate group of the phosphodiester bond carries a negative charge, which repels other nucleophilic species such as hydroxide ions and so prevents them from attacking. Because DNA lacks a hydroxyl group on the 2' carbon, it is more resistant to nucleophilic attack and is thus a more stable hereditary material than RNA.



Phosphodiester Linkage in DNA

What is DNA? DNA is a nucleic acid: a long linear polymer of deoxyribose sugars, each covalently bonded to a base. One of the major functions of DNA is the storage of genetic information. In DNA a sequence of three bases, called a codon, encodes a single amino acid. The amino acid is added to a growing protein during the process of translation. These nucleic acid polymers encode, in the form of genes, all of the materials an organism needs to live. Genes are small blocks of DNA that tell the cell which proteins it should create. The genes that a given cell receives depend entirely on the parent cells. Genes are passed on from generation to generation as a way of ensuring an organism's survival genetically.

DNA stands for deoxyribonucleic acid. The prefix "deoxy" distinguishes DNA from its close relative RNA (ribonucleic acid). The prefix indicates that, unlike ribose, deoxyribose does not contain a hydroxyl group at the 2' carbon; a single hydrogen atom takes its place. The absence of this hydroxyl group is fundamental in determining the way in which DNA is able to condense itself within the nucleus of a cell.
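The idea that each three-base codon specifies one amino acid can be illustrated with a short lookup. This is a sketch under stated assumptions: the table holds only five entries of the standard genetic code, written as DNA coding-strand triplets, and the function names and example sequence are hypothetical. In the cell, translation actually reads the codons from messenger RNA.

```python
# A small subset of the standard genetic code, written as DNA codons.
# The full code has 64 codons; only a few are listed for illustration.
CODON_TABLE = {
    "ATG": "Met",
    "TTT": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "TAA": "Stop",
}


def split_codons(dna):
    """Split a sequence into consecutive triplets, dropping any remainder."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna):
    """Map each codon to an amino acid until a stop codon is reached."""
    peptide = []
    for codon in split_codons(dna):
        residue = CODON_TABLE.get(codon, "???")  # "???" marks codons not in the subset
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


peptide = translate("ATGTTTGGCAAATAA")
print("-".join(peptide))
```

Here the fifteen bases are read as five codons, and the final codon ends the chain rather than adding a residue.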

DNA is a nucleic acid that is duplicated by the enzyme known as DNA polymerase. Each of the four bases in DNA, adenine (A), cytosine (C), guanine (G), and thymine (T), is bonded covalently to a deoxyribose sugar. The four nucleotide units in DNA are called deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate. A nucleotide includes a nucleoside, which is a nitrogenous base bonded to a deoxyribose or ribose group. The four nucleosides in DNA are deoxyadenosine, deoxyguanosine, deoxycytidine, and thymidine. A nucleotide is formed by joining one or more phosphate groups to a nucleoside through ester linkages.

The deoxyribose sugars form the structural backbone of DNA via a phosphodiester bond between the 3' carbon of one nucleotide and the 5' carbon of the next. When DNA is not being replicated, it exists in the cell as a double-stranded helical molecule with the strands lined up antiparallel to each other. That is to say, if one strand is oriented 3' to 5', the other strand is oriented 5' to 3'. The bases of each strand bind very specifically: A binds with T and C binds with G, and no other combination exists, at least in DNA. The bases are bound to one another internally via hydrogen bonds, with the phosphodiester backbone oriented to face outward. It is here that the missing 2' hydroxyl group plays an important role in DNA. It is the absence of this group that allows DNA to form its conventional double helix structure. RNA, which does have a hydroxyl group at the 2' carbon, is unable to adopt this same helical structure. The modern double helix structure of DNA was first proposed by Watson and Crick, and the functions of DNA were demonstrated in a series of experiments which will be discussed in the next few sections.
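The pairing rules (A with T, C with G) and the antiparallel orientation of the two strands together mean that the partner strand, read 5' to 3', is the reverse complement of the first. A minimal sketch, with a hypothetical function name and example sequence:

```python
# Watson-Crick pairing rules for DNA as stated in the text.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the partner strand written 5' to 3'.

    The input is assumed to be written 5' to 3'. Because the strands are
    antiparallel, the partner is complemented base by base and then read
    in the opposite direction.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


partner = reverse_complement("ATGCCG")
print(partner)
```

Applying the function twice returns the original strand, which is a quick consistency check on the pairing table.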

Why DNA? It is worth noting the reasons why DNA is the primary means by which all cells pass along genetic information. That is to say, why has evolution favored a DNA world over an RNA world, given that the two molecules are so similar structurally? The reasons involve chemical stability, the energy needed to form and break chemical bonds, and the availability of enzymes to perform this task. The primary reason is the relative stability of the two molecules. DNA is more chemically stable than RNA because it lacks the hydroxyl group on the 2' carbon. In RNA there are two possible OH groups between which the molecule can form a phosphodiester bond, which means that RNA is not forced into the same rigid structure as its deoxy counterpart. Additionally, the deoxyribose sugar in DNA is much less reactive than the ribose sugar in RNA. Simply put, C-H groups are significantly less reactive than C-OH (hydroxyl) groups. This difference also explains why RNA is not very stable in alkaline conditions while DNA is: under alkaline conditions the 2' hydroxyl group of RNA is deprotonated and can attack the neighboring phosphodiester bond, whereas DNA has no such group. Furthermore, double-stranded DNA has relatively small grooves where damaging enzymes cannot attach, making it more difficult for them to attack the DNA. Double-stranded RNA, on the other hand, has much larger grooves and is therefore more easily broken down by enzymes. The connection between the strands of double-stranded DNA is also tighter than in double-stranded RNA; in other words, it is much easier to unzip double-stranded RNA than double-stranded DNA. Overall, RNA can be broken down and re-formed faster, and with less energy, than DNA. It is essential to an organism's survival and well-being that its genetic material be encoded in something more stable and resistant to change. In addition, the sequence of DNA and its physical conformation seem to play a part in DNA's selection as well.
Another point that helps explain DNA's prevalence as the primary store of genetic information is the availability of the enzymes that break down DNA. The body actively destroys foreign nucleases, which are enzymes that cleave DNA. This is only one of the many ways DNA is protected against damage. The body can also recognize foreign DNA and destroy it, while leaving its own DNA intact.

Hyperchromic Effect Another unique feature of DNA in its double-stranded form is the hyperchromic effect: double-helical DNA absorbs less UV electromagnetic radiation than the non-helical conformation of the molecule, so absorbance rises as the helix unwinds. The hydrogen bonding between complementary DNA strands, together with base stacking in the helical conformation, makes the aromatic rings more stable so that they absorb less UV radiation. This ultimately decreases the amount of UV absorption by 40%. As the temperature is increased, these hydrogen bonds break and the helical structure begins to unwind. In this unwound form the aromatic rings are free to absorb much more UV radiation.
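The absorbance change described above is often used to follow unwinding as a sample is heated. The sketch below estimates the unwound fraction from a single absorbance reading. It assumes a simple two-state model in which absorbance varies linearly between the fully helical and fully unwound values; that model, the function name, and the example readings are assumptions for illustration, while the 40% figure is the one quoted in the text.

```python
def fraction_unwound(absorbance, a_helix, a_unwound):
    """Estimate the unwound fraction from a UV absorbance reading.

    Assumes absorbance rises linearly from a_helix (fully double-helical)
    to a_unwound (fully unwound); the result is clamped to [0, 1].
    """
    if a_unwound <= a_helix:
        raise ValueError("unwound DNA is expected to absorb more than helical DNA")
    fraction = (absorbance - a_helix) / (a_unwound - a_helix)
    return min(1.0, max(0.0, fraction))


# Hypothetical readings in arbitrary units: the helical form is taken to
# absorb 40% less than the unwound form, as stated in the text.
a_unwound = 1.0
a_helix = a_unwound * (1 - 0.40)

halfway = fraction_unwound(0.8, a_helix, a_unwound)
print(f"Fraction unwound at A = 0.8: {halfway:.2f}")
```

A reading halfway between the two limits corresponds to about half of the helix being unwound, which is how a melting midpoint is commonly read off such a curve.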

Properties of DNA

Two hydrogen bonds only
Three hydrogen bonds

1. Consists of 2 strands (anti-parallel and complementary): DNA has two polynucleotide chains that run in opposite directions and twist around a common helical axis.

2. It is made up of deoxyribose sugar and phosphate, which form a backbone on the exterior, and nucleic acid bases in the interior.

3. Bases are nearly perpendicular to the helix axis and are separated by 3.4 Angstroms.

4. Strands are held together by hydrogen bonds and various other intermolecular forces to form a double helix. The base pairing involves 2 hydrogen bonds for A - T and 3 hydrogen bonds for C - G (see the images to the right).

5. Backbone consists of alternating sugars and phosphates, where phosphodiester linkages form the covalent backbone of the DNA. The direction of DNA goes from the 5' phosphate group to the 3' hydroxyl group.

6. The helix repeats every 10 base pairs.

7. Weak forces, namely the hydrophobic effect and van der Waals interactions, stabilize DNA.

8. DNA chain is 20 Angstroms wide (2 nm)

9. One nucleotide unit is 3.3 Angstroms long (0.33 nm)

Primary Structure


DNA is made of two polynucleotide chains (strands) that run in opposite directions around a common axis. As a result, DNA has a double helical structure. Each polynucleotide chain consists of monomer units. A monomer unit has three main components: a sugar, a phosphate, and a nitrogenous base. The sugar in the DNA monomer unit is deoxyribose (it lacks an oxygen atom on the second carbon of the furanose ring). Four nitrogen-containing bases can be used in the monomer unit of DNA: adenine (A), guanine (G), cytosine (C), and thymine (T). Adenine and guanine are purine derivatives, while cytosine and thymine are pyrimidine derivatives. The polymeric chain forms when nucleosides (the sugar covalently bonded to the nitrogen-containing base) are joined through phosphodiester linkages. Each polymeric chain is a single strand of the DNA molecule, and two strands running in opposite directions form the double helix. The forces that hold the strands together are hydrogen bonds, hydrophobic interactions, van der Waals forces, and charge-charge interactions. The H-bonds form between bases on the antiparallel strands. A base in the first strand forms H-bonds only with a specific base in the second strand, and those two bases form a base pair (the H-bond interaction that keeps the strands together in the double helical structure). The base pairs are adenine-thymine (A-T) and cytosine-guanine (C-G). This pairing indicates that the nitrogen-containing bases are located inside the DNA double helix, while the sugars and phosphates are located on the outside. The hydrophobic bases inside the double helix are stacked one on top of another, and stacked bases interact with each other through van der Waals forces. Even though individual van der Waals forces are weak, the sum of those forces can be substantial.
The distance between two neighboring bases, which lie perpendicular to the main axis, is 3.4 Å. The DNA structure is repetitive: there are ten base pairs per turn, so each base pair is rotated 36° relative to the next. The diameter of the double helix is approximately 20 Å. The hydrophobic effect stabilizes the double helix. The structural variation in DNA is due to the different deoxyribose conformations, rotation about the contiguous bonds in the phosphodeoxyribose backbone, and free rotation about the C-1'-N glycosyl bond.

The technique of Southern blotting is often used to detect a specific DNA sequence in a sample. The technique is named after Edwin Southern.

DNA Manipulation Techniques


Exploring genes and genomes depends on the technical tools available. Five important DNA manipulation techniques are:

1. Restriction Endonucleases - also known as restriction enzymes

Restriction enzymes cleave DNA at specific recognition sequences, splitting it into defined fragments. Having the DNA split into specific pieces allows individual DNA segments to be manipulated.
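
As a rough illustration of how a restriction enzyme splits DNA into specific fragments, the sketch below cuts a single strand, written as a string, at every occurrence of a recognition sequence. The EcoRI site GAATTC (cut between G and A) is used as the example. The function name `digest`, its parameters, and the test sequence are assumptions made for illustration, and the second strand and its cohesive ends are ignored.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut a linear DNA string at every occurrence of `site`.

    cut_offset is the number of bases of the site that stay on the
    left-hand fragment (1 for EcoRI, which cuts between G and AATTC).
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # whatever remains after the last cut
    return fragments


sequence = "AAGAATTCCCGGGAATTCTT"  # hypothetical sequence with two sites
fragments = digest(sequence)
print(fragments)
```

Joining the fragments back together in order reproduces the original sequence, which is a quick check that no bases were lost in the cut.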

2. Blotting Technique

To separate and characterize DNA, the Southern blotting technique is used. This technique is similar to the Northern blot and the Western blot, except that Southern blotting is used for DNA, whereas Northern blotting is used for RNA and Western blotting for proteins. The technique identifies a specific DNA sequence among fragments separated by electrophoresis through an agarose gel. Electrophoresis separates the fragments by size: the large fragments stay near the top of the gel and the small fragments migrate toward the bottom. Next, the DNA is transferred onto a nitrocellulose sheet. Then a 32P-labeled DNA probe that is complementary to the sequence of interest is added to hybridize with the fragments. Finally, autoradiography film is used to view the fragment containing the sequence.

3. DNA Sequencing

DNA sequencing determines the precise nucleotide sequence of a DNA molecule. The key to DNA sequencing is the generation of DNA fragments whose length depends on the last base of the sequence. Although there are several alternative methods, they all carry out the same procedure on four reaction mixtures.

A. Chain termination DNA Sequencing

A primer is always needed. To produce fragments, a 2',3'-dideoxy analog of one dNTP is added to each of the four reaction mixtures. Because the analog lacks a 3'-hydroxyl group, incorporating it stops chain growth at that position. The dNTPs used are dATP, dTTP, dCTP, and dGTP. In the end, the new DNA strands are separated by electrophoresis.
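
The logic of reading a sequence from chain-terminated fragments can be sketched in a few lines. This is a simplified model, not a simulation of the chemistry: it assumes every possible terminated fragment is produced, counts fragment lengths from the first base added after the primer, and uses hypothetical function names and a made-up sequence.

```python
def chain_termination_fragments(new_strand):
    """Return {dideoxy base: [fragment lengths]} for the four reaction mixtures.

    In the mixture containing ddATP, synthesis can stop wherever the new
    strand has an A, and likewise for the other three bases.
    """
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_sequence(lanes):
    """Order all fragments from shortest to longest and read off the bases."""
    length_to_base = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(length_to_base[n] for n in sorted(length_to_base))


lanes = chain_termination_fragments("GATTACA")
print(lanes)
print(read_sequence(lanes))
```

Because each fragment length occurs in exactly one of the four mixtures, sorting by length (which is what electrophoresis does) recovers the order of the bases in the newly synthesized strand.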

B. Fluorescence Detection of Bases

A fluorescent tag is attached to each of the four chain-terminating dideoxy nucleotides, with each tag emitting at a different wavelength. It is an effective method because no radioactive reagents are used and long sequences of bases can be determined. The fragments are separated by electrophoresis under high voltage. Then the fragments are detected by their fluorescence, and the base sequence is read from the sequence of colors.

C. Top-down (Shotgun) Method of Genome Sequencing

The top-down method and the shotgun method are similar; the main difference is that the top-down method requires a detailed map of the clones. The shotgun method sequences large clones at random and then matches the overlapping sequences computationally.

D. Microarrays (gene chips)

Microarrays are useful for studying the expression of a large number of genes at once. The microarray is created using either oligonucleotides or cDNA. After hybridization with fluorescently labeled samples, each spot appears red or green depending on the relative fluorescence intensity. A red spot indicates increased expression of the gene, known as gene induction. A green spot indicates decreased expression, known as gene repression.


4. DNA Synthesis

To synthesize DNA, a solid-phase method is used. The solid-phase synthesis is carried out by the phosphite triester method. In this process only one nucleotide is added in each cycle. The first step is the binding of the first nucleotide to an insoluble support. The next nucleotide is then added as an activated monomer, a deoxyribonucleoside 3'-phosphoramidite carrying a DMT (dimethoxytrityl) protecting group and a βCE (β-cyanoethyl) protecting group. This modified and protected nucleotide is used as the basic building block because it can be coupled in any order, allowing any DNA sequence to be synthesized. After coupling, the molecule is oxidized to convert the phosphite triester into a phosphate triester. Finally, the DMT group is removed by addition of dichloroacetic acid, which frees the end of the chain for the next cycle. Throughout the synthesis the desired product remains bound to the insoluble support, and it is released at the end.

5. Polymerase Chain Reaction (PCR)

PCR is a technique that amplifies the DNA sequence lying between two primer-binding sites. If the sequences flanking the target are known, millions of copies of the target can be obtained with this technique. To carry out PCR, a DNA template, the four deoxyribonucleoside triphosphate (dNTP) precursors, a thermostable DNA polymerase, and two primers complementary to the flanking sequences are needed. What makes PCR distinctive is that the temperature is cycled through three different stages, and the cycle is typically repeated about 25 times. The three stages are:

1. Denaturing - The double-stranded parent DNA molecule is denatured into two single strands by heating the solution to 94°C.

2. Annealing - The solution is cooled to between 50°C and 60°C so that the two synthetic oligonucleotide primers can hybridize, one to the 3' end of the target sequence on one strand and the other to the 3' end of the target sequence on the complementary strand.

3. Polymerization - The thermostable DNA polymerase catalyzes 5' to 3' DNA synthesis from each primer at 72°C.
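
The power of PCR comes from doubling: if every cycle copied every target perfectly, n cycles would give 2^n copies per starting template. A minimal sketch of that arithmetic follows. The function name and the `efficiency` parameter (the fraction of templates copied in each cycle) are illustrative assumptions, and real reactions fall short of the ideal.

```python
def pcr_copies(initial_templates, cycles, efficiency=1.0):
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency = 1.0 means every template is copied in every cycle
    (perfect doubling); lower values model incomplete copying.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_templates * (1.0 + efficiency) ** cycles


for cycles in (10, 20, 25):
    print(cycles, f"{pcr_copies(1, cycles):,.0f}")
```

With perfect doubling, 25 cycles starting from a single template give 2^25, or roughly 33.5 million, copies, which is why a known sequence can be amplified millions of times.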

Structural Variation


Structural variation occurs due to the different deoxyribose conformations, free rotation about the C-1'-N glycosyl bond, and rotation about the contiguous bonds in the phosphodeoxyribose backbone.

DNA can adopt several secondary structures, known as the A, B, and Z forms. A Form: 1. Right handed 2. Glycosyl bond conformation is anti 3. About 11 base pairs per helical turn 4. Diameter is about 26 angstroms 5. Sugar pucker conformation is C-3' endo

B Form: 1. Like the A form, the B form is right handed 2. Glycosyl bond conformation is anti 3. About 10.5 base pairs per helical turn 4. Diameter is about 20 angstroms 5. Sugar pucker conformation is C-2' endo

Z Form: 1. Unlike the A and B forms, the Z form is left handed 2. Glycosyl bond conformation depends on the base: anti for pyrimidines and syn for purines 3. About 12 base pairs per helical turn 4. Diameter is about 18 angstroms 5. Sugar pucker conformation is C-2' endo for pyrimidines and C-3' endo for purines

DNA libraries


A DNA library is a collection of cloned DNA fragments in a cloning vector that can be searched for a DNA of interest. If the goal is to isolate particular gene sequences, two types of library are useful.

Genomic DNA libraries


A genomic DNA library is made from the genomic DNA of an organism. For example, a mouse genomic library could be made by digesting mouse nuclear DNA with a restriction nuclease to produce a large number of different DNA fragments but all with identical cohesive ends. The DNA fragments would then be ligated into linearized plasmid vector molecules or into a suitable virus vector. This library would contain all of the nuclear DNA sequences of the mouse and could be searched for any particular mouse gene of interest. Each clone in the library is called a genomic DNA clone. Not every genomic DNA clone would contain a complete gene since in many cases the restriction enzyme will have cut at least once within the gene. Thus some clones will contain only a part of a gene.

cDNA Library


A cDNA library is a collection of DNA copies of a cell's mRNAs. Because mature mRNA has had its introns removed and contains only exons, a cDNA library makes it possible to isolate the final, expressed version of each gene.
A cDNA library is used to screen colonies for a gene of interest. The collection of plasmids is used to transform bacteria, and the resulting colonies are screened with a probe, for example by Southern hybridization. An oligonucleotide probe that is complementary to the gene of interest reveals which bacterial colonies carry the plasmid whose DNA corresponds to that mRNA.
How to make a cDNA library:
1. Isolate mRNA from the cell.
2. Use reverse transcriptase and dNTPs to create a DNA copy of the original mRNA.
3. Degrade the mRNA with an alkali solution, which works because RNA is degraded more easily than DNA.
4. Use DNA polymerase to synthesize the second strand from the single-stranded DNA template.
Ultimately, you end up with double-stranded DNA, one strand of which has the same sequence as the mRNA (with T in place of U). After doing this for all of the mRNA, you can clone the DNA into plasmids. The collection of plasmids will include all of the mRNA sequences, but in the form of DNA.[1]
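
The sequence relationships in steps 2 to 4 can be sketched with simple string operations. This is only a model of base pairing, not of the enzymes; the function names and the short mRNA are hypothetical, and all strands are written 5' to 3'.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna):
    """First cDNA strand: the DNA complement of the mRNA, written 5' to 3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand(first_strand):
    """Second cDNA strand, made by DNA polymerase from the first strand."""
    return "".join(DNA_TO_DNA[base] for base in reversed(first_strand))


mrna = "AUGGCCUAA"                 # hypothetical mRNA
first = reverse_transcribe(mrna)   # step 2: reverse transcriptase
second = second_strand(first)      # step 4: DNA polymerase
print(first, second)
```

The second strand comes out identical to the mRNA except that T replaces U, which is the sense in which the cloned DNA carries the mRNA sequence.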

Flow of Genetic Information

  • Genetic information storage: genome
  • Replication: DNA --> DNA
  • Transcription: DNA --> RNA
  • Translation: RNA --> Proteins
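
The three arrows above can be illustrated with a toy example. The sketch below is an assumption-laden simplification: it treats sequences as plain strings, transcribes from the coding strand (so the mRNA is the same sequence with U in place of T), and includes only a handful of codons from the standard genetic code, so any other codon raises a `KeyError`.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Deliberately partial codon table; "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M", "UUU": "F", "GGC": "G", "AAA": "K",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def replicate(strand):
    """DNA -> DNA: the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def transcribe(coding_strand):
    """DNA -> RNA: the mRNA matches the coding strand with U instead of T."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """RNA -> protein: read codons in frame until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


coding = "ATGTTTGGCAAATAA"  # hypothetical coding strand
print(replicate(coding))
print(transcribe(coding))
print(translate(transcribe(coding)))
```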


  1. Viadiu, Hector. "Making a cDNA Library." UCSD. Lecture. November 2012.



Berg, Jeremy; Tymoczko, J.; Stryer, L. Biochemistry. 7th edition. New York: W.H. Freeman and Company, 2012. Print.

Berg, Jeremy; Tymoczko, J.; Stryer, L. "Protein Composition and Structure." Biochemistry. 7th edition. New York: W.H. Freeman and Company, 2012. ISBN 1-4292-2936-5.

Hames, David; Hooper, Nigel. Biochemistry. 3rd edition. New York: Taylor and Francis Group, 2005.



Deoxyribonucleic acid (DNA) stores information for the synthesis of specific proteins. DNA has deoxyribose as its sugar. DNA consists of a phosphate group, a sugar, and a nitrogenous base. The structure of DNA is a helical, double-stranded macromolecule with bases projecting into the interior of the molecule. The two strands are always complementary in sequence. One strand serves as a template for the formation of the other during DNA replication, which is the basis of inheritance. This unique feature of DNA provides a mechanism for the continuity of life. Key evidence for the structure of DNA came from Rosalind Franklin, who used X-ray diffraction to study the genetic material. The X-ray photo she obtained revealed that DNA has a helical structure.

DNA has a double helix structure. The outer edges are formed by alternating deoxyribose sugar molecules and phosphate groups, which make up the sugar-phosphate backbone. The two strands run in opposite directions, one going in a 3' to 5' direction and the other going in a 5' to 3' direction. The nitrogenous bases are positioned inside the helix structure like "rungs on a ladder," due to the hydrophobic effect, and stabilized by hydrogen bonding.

Nitrogenous base Nucleoside Deoxynucleoside

The two strands run in opposite directions to form the double helix. The strands are held together by hydrogen bonds and hydrophobic interactions. The H-bonds are formed between the bases of the anti-parallel strands. A base in the first strand forms H-bonds only with a specific base in the second strand. Those two bases form a base pair (the H-bond interaction that keeps the strands together in the double helical structure). The base pairs in DNA are adenine-thymine (A-T) and cytosine-guanine (C-G). Such interactions tell us that the nitrogen-containing bases are located inside the DNA double helix, while the sugars and phosphates are located on the outside.

The component consisting of the base and the sugar is known as the nucleoside. DNA contains deoxyadenosine (deoxyribose sugar bonded to adenine), deoxyguanosine (deoxyribose sugar bonded to guanine), deoxycytidine (deoxyribose sugar bonded to cytosine), and deoxythymidine (deoxyribose sugar bonded to thymine). The bond between the base and the sugar is known as the beta-N-glycosidic linkage. In purines, it joins N-9 of the base to C-1' of the sugar, and in pyrimidines it joins N-1 to C-1'. A nucleoside and a phosphate group make up a nucleotide. Successive nucleotides are joined through their deoxyribose sugars and phosphate groups by 3'-5' phosphodiester linkages.

The bases, located inside the double helix, are stacked. Stacked bases interact with each other through van der Waals forces. Although the energy associated with a single van der Waals interaction is relatively small, in a helical structure a large number of atoms take part in such interactions and the net sum of the energy is quite substantial. The distance between two neighboring bases, which lie perpendicular to the main axis, is 3.4 Å. The DNA structure is repetitive. There are ten base pairs per turn, that is, the structure repeats after 34 Å, so each base pair is rotated 36° relative to the next. The radius of the double helix is approximately 10 Å.
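
The numbers in this paragraph follow from one another, as the short calculation below shows. The constants are the values quoted in the text; the function name and the 100 base pair example are illustrative assumptions.

```python
RISE_PER_BASE_PAIR = 3.4    # Å between neighboring base pairs
BASE_PAIRS_PER_TURN = 10    # base pairs in one full turn of the helix

# Derived quantities
ROTATION_PER_BASE_PAIR = 360 / BASE_PAIRS_PER_TURN        # degrees
HELIX_REPEAT = RISE_PER_BASE_PAIR * BASE_PAIRS_PER_TURN   # Å per turn


def helix_dimensions(n_base_pairs):
    """Return (length in Å, number of turns) for an idealized helix."""
    length = n_base_pairs * RISE_PER_BASE_PAIR
    turns = n_base_pairs / BASE_PAIRS_PER_TURN
    return length, turns


print(ROTATION_PER_BASE_PAIR, HELIX_REPEAT)
print(helix_dimensions(100))
```

For example, an idealized 100 base pair fragment is about 340 Å (34 nm) long and makes 10 full turns.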

An easy way to differentiate between ribonucleosides and deoxynucleosides is to look at the atoms bonded to C-2' of the sugar unit. In a deoxynucleoside, C-2' bears two hydrogens. In a ribonucleoside, C-2' bears one hydrogen and one hydroxyl group, with the hydroxyl group pointing down in the standard ring drawing.

Structural variations in DNA can arise from:
1. Different deoxyribose conformations
2. Rotations around the contiguous bonds in the phosphodeoxyribose backbone
3. Free rotation about the C-1'-N glycosyl bond (syn/anti)[1]

Terms and Naming


There are two types of nucleic acids, ribonucleic acid (RNA) and deoxyribonucleic acid (DNA). Recall that a nucleoside is a base + sugar, and a nucleotide is a base + sugar + phosphate. The deoxy- prefix in deoxyribonucleotides is the nomenclature used for DNA. The term ribonucleotide is used for RNA; in other words, C-2' on the sugar unit has an -OH group (whereas in the deoxy form C-2' has two hydrogens). Symbols are used to simplify the names. Take ATP (a precursor of RNA) as an example: the "A" at the front signifies that the base is adenine, and the "T" in the middle signifies a triphosphate. AMP also has adenine, but the "M" signifies that the sugar is bound to a single phosphate group. Finally, in dAMP, the "d" signifies a 2'-deoxyribonucleotide, whereas plain AMP is a ribonucleotide. In short, the four nucleotide units of DNA are called deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate.
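
The symbol convention described above is regular enough to decode mechanically. The sketch below is an illustrative assumption, not a standard tool: it handles only symbols of the form base letter, phosphate letter, P, with an optional leading "d", and it does not cover the common shorthand in which thymidine nucleotides are written without the "d".

```python
BASES = {"A": "adenine", "G": "guanine", "C": "cytosine",
         "T": "thymine", "U": "uracil"}
PHOSPHATE_COUNT = {"M": 1, "D": 2, "T": 3}  # mono-, di-, triphosphate


def decode_nucleotide(symbol):
    """Decode symbols such as 'ATP' or 'dAMP' into their components."""
    deoxy = symbol.startswith("d")
    core = symbol[1:] if deoxy else symbol
    if (len(core) != 3 or core[0] not in BASES
            or core[1] not in PHOSPHATE_COUNT or core[2] != "P"):
        raise ValueError(f"unrecognized nucleotide symbol: {symbol!r}")
    return {
        "base": BASES[core[0]],
        "sugar": "2'-deoxyribose" if deoxy else "ribose",
        "phosphates": PHOSPHATE_COUNT[core[1]],
    }


for symbol in ("ATP", "AMP", "dAMP"):
    print(symbol, decode_nucleotide(symbol))
```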

Early foundation for DNA structures


The primary structure of a nucleic acid is its covalent structure and nucleotide sequence. One of the most important clues to the structure of DNA came from the work of Erwin Chargaff and his colleagues in the late 1940s. They found that the four nucleotide bases of DNA occur in different ratios in the DNA of different organisms and that the amounts of certain bases are closely related. They concluded the following about the structure of DNA:

DNA general structure and its bases

1. The base composition of DNA generally varies from one species to another.

2. DNA specimens isolated from different tissues of the same species have the same base composition.

3. The base composition of DNA in a given species does not change over time, nutritional states, or environment.

4. In all cellular DNA, regardless of the species, the number of adenine residues is equal to the number of thymine residues (A=T) and the number of guanine residues is equal to the number of cytosine residues (G=C).
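
The fourth rule is a direct consequence of base pairing, which a short sketch can make concrete. The code builds the complementary strand for an arbitrary sequence and counts bases over both strands; the function names and the example sequence are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def duplex_base_counts(strand):
    """Count each base over both strands of the double helix."""
    counts = Counter(strand + complementary_strand(strand))
    return {base: counts[base] for base in "ACGT"}


counts = duplex_base_counts("AATGA")  # a single strand with unequal A and T
print(counts)
```

The single strand used here has three A and only one T, yet in the duplex A equals T and G equals C, because every A on one strand is matched by a T on the other.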

In the early 1950s, Rosalind Franklin and Maurice Wilkins used X-ray diffraction, a powerful technique also called X-ray crystallography, to deduce the DNA structure. Photographs produced by the X-ray crystallography method are not actually pictures of molecules but rather the spots and smudges produced by X-rays that were diffracted (deflected) as they passed through crystallized DNA. Crystallographers use mathematical equations to translate such patterns of spots into information about the three-dimensional shape of DNA. Franklin and Wilkins found that DNA molecules are helical with two periodicities along their long axis, a primary one of 3.4 Å and a secondary one of 34 Å.

A DNA molecule separating to create new daughter DNA

Watson and Crick later based their model of DNA upon the data they were able to extract from Wilkins and Franklin's X-ray diffraction photo.

They interpreted the pattern of spots on the X-ray photo to mean that DNA consisted of two chains and was helical in shape. Eventually, Watson and Crick formulated a DNA structure from the diffraction pattern, an insight that is still accepted today. In this structure, they proposed that two helical DNA chains running in opposite directions wind around the same axis to form a right-handed double helix. The hydrophilic backbones, formed by phosphodiester bonds linking alternating deoxyribose sugars and phosphate groups, face the outside of the helix and are surrounded by the aqueous environment. The furanose ring of each deoxyribose sugar is in the C-2' endo conformation. The purine and pyrimidine bases of both strands are stacked inside the double helix and stabilized by van der Waals interactions.

The double helix has a diameter of about 20 Å (a radius of about 10 Å). Adjacent bases on one strand of the double helix are 3.4 Å apart. Every 10 base pairs constitute a 360° turn of the helix, so the helix advances 34 Å per 10 base pairs.

Nucleoside (adenosin) with beta glycosidic bond
DNA strand



DNA molecules are asymmetrical, a property that is essential in the processes of DNA replication and transcription. A double-stranded DNA molecule consists of two complementary but separate strands that are intertwined into a helix through a network of H-bonds. Although both right-handed and left-handed helices are among the allowed conformations, right-handed helices are energetically more favorable due to less steric hindrance between the side chains and the backbone. The direction of DNA is determined by the arrangement of the phosphate and deoxyribose sugar groups along the DNA backbone. One end of the DNA terminates with a 3'-OH group, whereas the other terminates with a 5'-phosphate group. DNA sequences are usually written from the 5' to the 3' terminus. In a double helix, the complementary DNA strands are oriented in opposite directions. DNA is a rather rigid molecule: under physiological conditions, DNA curves on a length scale of about 50 nm, which is about 25 times the diameter of the double helix. Moreover, the alignment of the bases can indicate the global orientation of a DNA strand. For purine nucleotides (A and G) the most probable angle is approximately 88°, whereas for pyrimidines (C and T) that angle is approximately 105°.


Forces involved in DNA helices


The DNA double helix is held together by two main forces: hydrogen bonds between complementary base pairs inside the helix and the Van der Waals base-stacking interaction.



Hydrogen bonds


Watson and Crick found that the hydrogen-bonded base pairs, G with C and A with T, are those that best fit within the DNA structure. This conclusion was supported by the known fact that in each species the G content is equal to the C content and the T content is equal to the A content. It is important to note that three hydrogen bonds can form between G and C, but only two between A and T. As a result, A-T pairs are less stable than G-C pairs, and regions rich in A-T are easier to separate.

The three hydrogen bonds that link guanine (G) and cytosine (C) consequently affect the thermal melting of DNA, which depends on base composition. As the base composition varies, the melting point of the molecule increases or decreases accordingly.

Denaturing and Annealing

Ultraviolet (UV) light can be used to detect whether bases are stacked or unstacked. Stacked bases within the DNA structure shield one another from light, so double helical DNA absorbs much less UV light than single-stranded DNA. This characteristic is known as the hypochromic effect, in which less light is absorbed by the double helix of DNA molecules.

The melting temperature (Tm) is the temperature at which half of the DNA is double-stranded and half is single-stranded. The Tm depends greatly on base composition. Since G-C base pairs are stronger, owing to their additional hydrogen bond, DNA with high G-C content has a higher Tm than DNA with greater A-T content.
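
The dependence of Tm on base composition can be illustrated with the Wallace rule, a rule of thumb for short oligonucleotides that assigns 2 °C to each A or T and 4 °C to each G or C. The rule is not part of the text above and is only a rough estimate; it is brought in here as an assumption to make the trend concrete, and the function names and sequences are illustrative.

```python
def gc_fraction(sequence):
    """Fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def wallace_tm(sequence):
    """Rough Tm estimate in °C for a short oligonucleotide (Wallace rule)."""
    sequence = sequence.upper()
    at = sequence.count("A") + sequence.count("T")
    gc = sequence.count("G") + sequence.count("C")
    return 2 * at + 4 * gc


for oligo in ("ATATATATATATAT", "ATGCATGCATGCAT", "GCGCGCGCGCGCGC"):
    print(oligo, gc_fraction(oligo), wallace_tm(oligo))
```

The three sequences have the same length but increasing G-C content, and the estimated Tm rises with it, in line with the statement that G-C rich DNA melts at a higher temperature.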

When heat is applied to double-stranded DNA, the two strands will eventually separate (denature) because the hydrogen bonds between base pairs are disrupted. When the temperature is lowered again, the separated strands spontaneously reassociate to form the double helix. This process is known as annealing.

In biological systems, both denaturing and annealing can occur. Helicases use chemical energy (from ATP) to disrupt the structure of double-stranded nucleic acid molecules. The study of the ability of DNA to reanneal within the laboratory is important in discovering gene structure and expression.

Complex Structures

Complex structures can also be formed from single-stranded DNA. A stem-loop is formed when complementary sequences within the same strand pair to form a double helix, held by hydrogen bonds between bases of the same strand. Often, these structures include mismatched bases, which destabilize the local structure. Such destabilization can be important in higher-order folding, as in tertiary structures.

Hypochromic Effect


DNA absorbs very strongly in the ultraviolet, at wavelengths close to 260 nm. Single-stranded DNA absorbs more UV light than double-stranded DNA. Because UV absorption decreases when DNA forms a double strand, this characteristic is an indication of DNA stability. With less light energy absorbed, there is less disturbance to the structure, so the structure and therefore the function of the DNA remain intact.

The decreased absorbance of the DNA double helix relative to the denatured form is explained by the stacking of the nitrogenous bases in the double helix, which leaves them less exposed to radiation so that they absorb less. The aromaticity of the nitrogenous bases (specifically their purine and pyrimidine ring structures) accounts for the absorption peak at 260 nm.
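
The size of this effect is usually expressed as a percentage change in absorbance at 260 nm between the double-stranded and denatured samples. The sketch below shows that calculation; the function name and the absorbance readings are hypothetical and are not measurements from the text.

```python
def percent_hyperchromicity(a260_double_stranded, a260_denatured):
    """Percent increase in absorbance at 260 nm upon denaturation."""
    if a260_double_stranded <= 0:
        raise ValueError("absorbance of the double-stranded sample must be positive")
    increase = a260_denatured - a260_double_stranded
    return increase / a260_double_stranded * 100


# Hypothetical readings for the same sample before and after heating
print(round(percent_hyperchromicity(1.00, 1.40), 1))
```

Read in the opposite direction, the same pair of readings describes the hypochromic effect: the double-stranded sample absorbs less than the denatured one.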

Weak forces


Various weak forces come together to stabilize the DNA structure.

  • Hydrogen bonds link the bases. Although each is weak in energy, they are able to stabilize the helix because of the large number present in a DNA molecule.
  • Stacking interactions, also known as van der Waals interactions, between bases are weak, but the large number of these interactions helps to stabilize the overall structure of the helix.
    • The double helix is stabilized by the hydrophobic effect: burying the hydrophobic bases in the interior of the helix keeps them away from the surrounding water, whereas the more polar, hydrophilic surfaces are exposed to and interact with the exterior water.
    • Stacked base pairs also attract one another through van der Waals forces. The energy associated with a single van der Waals interaction is of little significance to the overall DNA structure; however, the net effect, summed over the numerous atom pairs, provides substantial stability.
    • Stacking is also favored by the conformations of the rigid five-membered rings of the backbone sugars.
  • Charge-charge interactions: the electrostatic (ion-ion) repulsion between the negatively charged phosphates is potentially destabilizing; however, Mg2+ and cationic proteins with abundant arginine and lysine residues neutralize the charge and stabilize the double helix.

Nitrogenous Bases


Nitrogenous bases are the foundational structure of DNA polymers; the structure of a DNA polymer varies with the nitrogenous bases attached.

Nitrogenous bases can tautomerize between keto and enol forms. The aromaticity of the pyrimidine (cytosine, thymine, and, in RNA, uracil) and purine (adenine, guanine) ring systems and the electron-rich nature of their -OH and -NH2 substituents enable them to undergo keto-enol tautomeric shifts. The keto tautomer is called a lactam and the enol tautomer is called a lactim. The lactam predominates at pH 7. Keto-enol tautomerization is the interconversion of a keto and an enol form through the movement of a proton and the shifting of bonding electrons; hence the isomerism qualifies as tautomerism.


Keto-enol tautomerism is important in DNA structure because rare enol tautomers of bases such as guanine and thymine have altered base-pairing properties and can lead to mutation. The same kind of tautomerism underlies the high phosphate-transfer potential of phosphoenolpyruvate: the phosphorylated compound is trapped in the less stable enol form, whereas dephosphorylation allows conversion to the keto form.

Base-stacking interactions


The two strands of double-stranded DNA are held together by a number of weak interactions such as hydrogen bonds, stacking interactions, and hydrophobic effects. Of these, the stacking interactions between base pairs are the most significant. The strength of base-stacking interactions depends on the bases: it is strongest for stacks of G-C base pairs and weakest for stacks of A-T base pairs. The hydrophobic effect stacks the bases on top of one another. The stacked base pairs attract one another through van der Waals forces, typically 2 to 4 kJ/mol per interaction. In addition, base stacking in DNA is favored by the conformations of the somewhat rigid five-membered rings of the backbone sugars. The base-stacking interactions, which are largely nonspecific with respect to the identity of the stacked bases, make the major contribution to the stability of the double helix.

Phosphodiester Bond

Phosphodiester Bond between nucleotides

Phosphodiester linkages form the covalent backbone of DNA. A phosphodiester bond is the linkage formed between the 3' carbon atom of one deoxyribose sugar and the 5' carbon atom of the next.

The phosphate groups in a phosphodiester bond are negatively charged. The pKa of the phosphate groups is near 0, so they are negatively charged at neutral pH (pH = 7). The resulting charge-charge repulsion forces the phosphate groups to take positions on opposite sides of the DNA strands, and the charge is neutralized by proteins (histones), metal ions such as magnesium, and polyamines.

The triphosphate or diphosphate forms of the nucleotide building blocks first have to be broken apart to release the energy required to drive the enzyme-catalyzed reaction that forms a phosphodiester bond and joins the nucleotide to the chain. Once a single phosphate or two phosphates (pyrophosphate) are split off in the catalytic reaction, the phosphodiester bond is formed.

The hydrolysis of phosphodiester bonds is catalyzed by phosphodiesterases, enzymes that play an important role in repairing DNA sequences.

One reason DNA is more stable than RNA is the absence of the 2'-OH group in DNA. The presence of an OH group on the 2' carbon makes RNA more susceptible to reaction. A base can abstract the proton from the 2'-OH group (when everything is in the correct trajectory); the resulting oxygen attacks the adjacent phosphorus, the phosphate part of the backbone rearranges, and eventually a P-O bond is broken, severing the connection between two sugars.

Secondary Structures of DNA


Major and Minor Grooves

Base pairing of complementary nucleotides makes up the secondary structure of DNA. A single-stranded DNA may also participate in intramolecular base pairing between complementary bases and thereby form secondary structure as well. Base pairing between adenine (A) and thymine (T) and between guanine (G) and cytosine (C) is possible because these base pairs are similar in size. This means there are no "bulges" or "gaps" within the double helix.

Irregular placement of base pairs in a double helix can render the macromolecule nonfunctional. Therefore, if something is wrong with the structure, signals are sent and DNA repair works to fix the damage.

As a result of the double helical nature of DNA, the molecule has two asymmetric grooves. One groove is smaller than the other. This asymmetry is a result of the geometrical configuration of the bonds between the phosphate, sugar, and base groups, which forces the base groups to attach at 120 degree angles instead of 180 degrees. The larger groove, called the major groove, occurs where the backbones are far apart, while the smaller one, called the minor groove, occurs where they are close together.

Since the major and minor grooves expose the edges of the bases, the grooves can be used to tell the base sequence of a specific DNA molecule. The possibility for such recognition is critical, since proteins must be able to recognize specific DNA sequences on which to bind in order for the proper functions of the body and cell to be carried out. As you might expect, the major groove is more information rich than the minor groove, allowing DNA-binding proteins to interact with the bases. This makes the minor groove less ideal for protein binding.

Visual Representation of Major and Minor Grooves in DNA Structure

A form


The following features are characteristic of the A-form DNA structure:

1. Most RNA and RNA-DNA duplexes are in this form

2. Shorter, wider helix than B

3. Deep, narrow major groove not easily accessible to proteins

4. Wide, shallow minor groove accessible to proteins, but lower information content than major groove

5. Favored conformation at low water concentrations

6. Base pairs tilted to helix axis and displaced from axis

7. Sugar pucker C3'-endo (in RNA 2'-OH inhibits C2'-endo conformation)

8. Right handed

9. Diameter is about 26 angstroms

10. About 11 base pairs per helical turn

11. Glycosyl bond conformation is anti

B form


The double helical structure of normal DNA takes a right-handed form called the B-helix. It is about 20 angstroms in diameter, with a C-2' endo sugar pucker conformation. The helix makes one complete turn approximately every 10 to 10.5 base pairs (about 34 Å per turn, or 3.4 Å per base pair). B-DNA has two principal grooves, a wide major groove and a narrow minor groove. Many proteins interact in the space of the major groove, where they make sequence-specific contacts with the bases. In addition, a few proteins are known to make contacts via the minor groove.

B and Z form DNA

Z form


DNA sequences can flip from a B form to a Z form and vice versa. The Z form of DNA is a more radical departure from the B structure; the most obvious distinction is the left-handed helical rotation.

The Z form is about 18 angstroms in diameter and has 12 base pairs per helical turn, so the structure appears more slender and elongated. The DNA backbone takes on a zigzag appearance. Certain nucleotide sequences fold into left-handed Z helices much more readily than others. Prominent examples are sequences in which pyrimidines alternate with purines, especially alternating C and G or 5-methyl-C and G residues. To form the left-handed helix in Z-DNA, the purine residues flip to the syn conformation, alternating with pyrimidines in the anti conformation. The major groove is barely apparent in Z-DNA, and the minor groove is narrow and deep. The sugar pucker conformation is C-2' endo for pyrimidines and C-3' endo for purines.

Z-DNA formation occurs during the transcription of genes, at transcription start sites near the promoters of actively transcribed genes. During transcription, the movement of RNA polymerase induces negative supercoiling upstream and positive supercoiling downstream of the site of transcription. The negative supercoiling upstream favors Z-DNA formation, so one function of Z-DNA may be to absorb negative supercoiling. At the end of transcription, topoisomerase relaxes the DNA back to the B conformation.

Tertiary structure (3 dimensional)


The tertiary structure of a DNA molecule describes how the double helix, formed by the two strands winding around each other, is itself arranged in three-dimensional space.

  • Linking number (Lk): in a covalently closed circular DNA, where the two strands cannot be separated, the number of times one strand winds around the other is constant for a given molecule. Lk is an integer made up of two components (Lk = Tw + Wr):
1) Twist (Tw): number of helical turns of the DNA strands
2) Writhe (Wr): number of supercoiled turns in the DNA

For example, a relaxed closed circular DNA with about 25 helical turns has an Lk of about 25. If the molecule is underwound by two turns before its ends are joined, Lk drops to 23. That deficit can be taken up as two missing helical turns with no supercoils (Tw = 23, Wr = 0) or as the full 25 helical turns plus two negative supercoils (Tw = 25, Wr = -2). This kind of interconversion of helical and superhelical turns is important in gene transcription and regulation.
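The bookkeeping between linking number, twist, and writhe can be sketched in a few lines of Python. This is only an illustration: the class name, the function name, the 260 base pair length, and the 10.4 base pairs per turn used for relaxed B-DNA are assumptions chosen for the example, not values taken from this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClosedCircularDNA:
    """Hypothetical container for one topological state of a closed circle."""
    twist: float   # Tw: number of helical turns
    writhe: float  # Wr: number of superhelical turns (negative = negative supercoils)

    @property
    def linking_number(self) -> int:
        # Lk = Tw + Wr, and Lk is always an integer for a closed circle
        return round(self.twist + self.writhe)


def relaxed_linking_number(base_pairs: int, bp_per_turn: float = 10.4) -> int:
    """Lk of the relaxed circle; 10.4 bp per turn is an assumed value."""
    return round(base_pairs / bp_per_turn)


lk_relaxed = relaxed_linking_number(260)          # assumed 260 bp circle
unwound = ClosedCircularDNA(twist=23, writhe=0)   # two helical turns missing
supercoiled = ClosedCircularDNA(twist=25, writhe=-2)  # two negative supercoils

# Both states share the same linking number, two less than the relaxed circle.
print(lk_relaxed, unwound.linking_number, supercoiled.linking_number)
```

Because Lk cannot change without breaking a strand, the two underwound states can interconvert freely, which is the trade-off between helical and superhelical turns described above.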

Quaternary structure and other unusual structure


DNA associates with histones and non-histone proteins to form chromatin. The phosphate groups give DNA a negative charge, which makes it relatively acidic. This negative charge binds to the basic, positively charged histone proteins.

Histone Modification


Recent studies indicate that actively transcribed regions are characterized by a specific pattern of histone modifications. Experiments on the dynamics of histone modification show significant kinetic differences between methylation, phosphorylation, and acetylation. This suggests that these modifications play different roles in gene expression patterns.

Histones are the proteins around which DNA wraps to form chromatin. The basic unit of chromatin is the nucleosome, which is formed by a histone octamer (2 molecules each of H2A, H2B, H3, and H4) together with 147 base pairs of DNA wrapped around it in a superhelix. The accessibility of DNA is regulated by higher-order chromatin structures that arise from the packing of nucleosomes. The N-terminal tails of the histones are believed to contribute to chromatin function by mediating inter-nucleosomal interactions and by recruiting non-histone proteins to the chromatin. The interactions that the N-terminal tails direct with chromatin binders are thought to be the driving force that modulates chromatin structure. However, modifications can also act in other ways, for example through the unfolding or assembly of nucleosomes and its involvement in gene regulation. It is hoped that this can provide an explanation of epigenetic inheritance, in which phenotypic differences between individuals cannot be due to differences in DNA sequence, as in monozygotic twins.

Epigenetic inheritance refers to changes in gene activity that are not encoded by the DNA sequence. The histone modifications involved include phosphorylation, methylation, ADP-ribosylation, SUMOylation, and ubiquitylation. These modifications can be considered active or repressive depending on whether they occur in active or silent genes. It has been shown that methylation can have different outcomes depending on the binders of the histone modification. Nucleosome positioning, which is related to the DNA sequence, may also contribute to epigenetic inheritance.[2]

Structural Variation in DNA


Structural variation in DNA is mostly due to:

1) Varying deoxyribose conformations (4 total conformations)
2) Rotations about the contiguous bonds in the phosphodeoxyribose backbone (between the C1-C3 and C5-C6)
3) Free rotation about C1'- N-glycosyl bond (resulting in syn or anti conformation)

Because of steric hindrance, purine bases in nucleotides are restricted to two stable conformations with respect to deoxyribose, called syn and anti. Pyrimidines, on the other hand, are generally restricted to the anti conformation because of steric interference between the sugar and the carbonyl oxygen at C-2 of the pyrimidine.

Comparison of A, B, and Z form of DNA

Property                             A form         B form         Z form
Helical sense                        Right handed   Right handed   Left handed
Diameter                             26 Å           20 Å           18 Å
Base pairs per helical turn          11             10.5           12
Helix rise per base pair             2.6 Å          3.4 Å          3.7 Å
Base tilt normal to the helix axis   20°            6°             7°
Sugar pucker conformation            C-3' endo      C-2' endo      C-2' endo for pyrimidines and C-3' endo for purines
Glycosyl bond conformation           Anti           Anti           Anti for pyrimidines and syn for purines
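The base pairs per turn and the rise per base pair in this comparison together fix the length of one helical turn (the pitch). The short Python sketch below works that out for the three forms. The dictionary layout and function names are illustrative assumptions; only the numbers come from the comparison above.

```python
# Values taken from the A/B/Z comparison; the names are hypothetical.
HELIX_FORMS = {
    "A": {"bp_per_turn": 11.0, "rise_angstrom": 2.6},
    "B": {"bp_per_turn": 10.5, "rise_angstrom": 3.4},
    "Z": {"bp_per_turn": 12.0, "rise_angstrom": 3.7},
}


def pitch_angstrom(form: str) -> float:
    """Length of one full helical turn: base pairs per turn x rise per base pair."""
    params = HELIX_FORMS[form]
    return params["bp_per_turn"] * params["rise_angstrom"]


def length_angstrom(form: str, n_base_pairs: int) -> float:
    """End-to-end length of an ideal helix with n_base_pairs base pairs."""
    return n_base_pairs * HELIX_FORMS[form]["rise_angstrom"]


for form in HELIX_FORMS:
    print(f"{form} form: pitch of about {pitch_angstrom(form):.1f} angstroms")
```

With 10.5 base pairs per turn the B-form pitch comes out near 35.7 Å, slightly more than the 34 Å quoted for a turn of exactly 10 base pairs. The same stretch of sequence is shortest in the A form and longest in the Z form, which matches the description of Z-DNA as slender and elongated.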


  1. Viadiu, Hector. "DNA Structure" UCSD Lecture. November 2011.
  2. Teresa Barth and Axel Imhof. "Fast signals and slow marks: the dynamics of histone modifications." Trends in Biochemical Sciences vol.31:11. Nov. 2010 (618-626).

Campbell and Reese's Biology, 7th Edition

Nelson and Cox's Lehninger Principles of Biochemistry, 5th Edition

Telomeres (from the Greek telos, "an end") are long stretches of repeating non-coding DNA sequences at the ends of the DNA strand. They cap the ends of chromosomes, protecting the coding DNA from being shortened and preventing the chromosome ends from attaching to other molecules. The Russian scientist Alexey Olovnikov was the first to postulate the problem of chromosomes replicating at the tip.[1] He theorized that bits of the DNA would be lost in every subsequent replication until a critical limit had been reached, whereupon cell division would cease.


Telomerase adding Telomere extension

Telomerase is the enzyme that builds telomeres. It adds a specific repeating sequence ("TTAGGG" in all vertebrates) to the 3' ends of DNA strands.

The telomerase enzyme carries an RNA template that partially anneals to the shortened end of the DNA strand. New nucleotides are then added along the template, extending the DNA strand. Once the telomerase leaves, DNA polymerase completes the complementary strand to give double-stranded DNA. Telomerase was discovered in 1985 by Carol W. Greider and Elizabeth Blackburn. For this discovery, they were awarded the 2009 Nobel Prize in Physiology or Medicine along with Jack W. Szostak.[3]

Szostak and Blackburn first studied telomeres in ciliates. Ciliates were chosen because at one stage of their life cycle they make a million new telomeres. The resulting model includes a telomere-dedicated DNA polymerase, which adds telomeric repeats onto chromosome ends. Telomeres therefore appear as a repeated motif in DNA sequences.

Telomerase's presence in humans is somewhat strange. It is located in the nucleus, which is unsurprising because that is where DNA replication takes place. However, telomerase activity is not present in all cells. It was found to be almost absent in the majority of normal adult tissues, including cardiac and skeletal muscle, lung, liver, and kidney. Because of this curious lack of telomerase activity, a theory arose connecting telomere length to aging and cell senescence. According to this theory, human somatic cells start out with a full complement of telomeric repeats, but the telomerase enzyme is not present in some tissues. The cells of those tissues lose about 50 to 100 nucleotides from each chromosome end each time they undergo replication and division. Eventually, the telomeres would cease to exist and the chromosomes themselves would start losing nucleotides, carrying genetic defects into the next division so that neither daughter cell would be viable. Thus, after a certain number of divisions, a cell no longer has enough telomeric nucleotides and dies.[4]
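The arithmetic behind this theory can be sketched as follows. The loss of 50 to 100 nucleotides per division comes from the paragraph above; the starting telomere length of 5,000 nucleotides and the function name are purely hypothetical values chosen to make the example concrete.

```python
def divisions_until_exhausted(telomere_length_nt: int, loss_per_division_nt: int) -> int:
    """Number of divisions after which the telomere is completely used up.

    Assumes a constant loss per division and no telomerase activity.
    """
    if telomere_length_nt < 0 or loss_per_division_nt <= 0:
        raise ValueError("length must be non-negative and loss must be positive")
    # Ceiling division: a final partial loss still consumes one more division.
    return -(-telomere_length_nt // loss_per_division_nt)


start_length = 5000  # hypothetical starting length in nucleotides
fewest = divisions_until_exhausted(start_length, 100)  # fastest loss
most = divisions_until_exhausted(start_length, 50)     # slowest loss
print(f"between {fewest} and {most} divisions")
```

A real cell would be expected to stop dividing once its telomeres reach a critical length rather than zero, so a model like this gives only an upper bound on the number of divisions.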

Telomeres at the end of a chromosome.

The function of telomerase is to replace the short stretches of telomere that are gradually lost during cell division.[5] Under normal conditions without telomerase, a cell divides until it hits a critical point known as the Hayflick limit.[6] In the presence of telomerase, however, the cell can replace the lost DNA and divide without limit. This continuous growth comes at a cost, as it may eventually lead to cancerous cells.

While the details are not fully known, it would seem that shortened telomeres play a role in aging due to the erosion of the DNA over time. The question arises whether telomerase could greatly extend the human lifespan, given its importance in the maintenance of telomeres.[7] Dr. Michael Fossel, a professor of clinical medicine at Michigan State University, has expressed his views on telomerase as a viable treatment for cell senescence.

However, several experiments have raised doubts about telomerase as an effective anti-aging treatment. In an experiment with mice that had higher levels of telomerase, the mice also had a higher rate of cancer, which led to a shorter lifespan. In addition, telomerase favors tumorigenesis.[8] Telomerase fosters cancer development by allowing uncontrolled cell growth, which eventually gives rise to tumors. In fact, telomerase activity has been observed in approximately 90% of all human tumors, which suggests that the uncontrolled growth permitted by telomerase has a key role in cancer.

In addition to using Telomerase as an anti-aging treatment, Telomerase has potential as a drug target against cancer.[9] Since it is necessary for the immortality of many cancer cell types, it is believed that if a drug is able to deactivate Telomerase activity in a cell, Telomeres would shorten, mutations would happen, cell stability would decrease and cancer would be, in essence, effectively treated. Experimental drugs have been tested in mouse models and some drugs have moved onto clinical testing.


Cancer Biology


The significance of studying telomeres can be found in telomerase, which rebuilds the telomere so that cells can keep dividing. Without sufficient telomerase, however, the telomere eventually shortens, causing the cell to stop dividing and die. In cancer cells, this enzyme maintains telomeres long past the cell's average lifetime. These cells are then said to be "immortalized", since they can divide endlessly. This results in a tumor. Many researchers believe that telomere maintenance activity is characteristic of most human cancer cells. Though the mechanism behind this is not well understood, its discovery may reveal key elements of telomere function. Telomerase is the natural enzyme used for telomere repair. It is highly abundant in stem cells, germ cells, hair follicles, and most cancer cells, but its expression is low or in some cases absent in somatic cells. Telomerase functions by adding bases to the ends of the telomeres. Cells with sufficient telomerase activity are considered immortal in the sense that they can divide past the Hayflick limit without entering senescence or apoptosis. For this reason, telomerase is viewed as a potential target for anti-cancer drugs such as telomestatin.

2009 Nobel Prize


The 2009 Nobel Prize in Physiology or Medicine was awarded to three scientists who discovered how chromosomes can be copied completely during cell division and how they are protected against degradation. They showed that the ends of the chromosomes, the telomeres, protect the chromosomes from degradation, and they identified telomerase, the enzyme that builds the telomeres. If the telomeres become shortened, cells age. Conversely, if telomerase activity is high, telomere length is maintained and cellular senescence is delayed. This is the case in cancer cells, where telomerase allows the cell to divide without any limit. Certain genetic diseases, in contrast, are caused by a defective telomerase. This discovery can thus be used to stimulate the development of new therapeutic strategies. Understanding such a fundamental mechanism is an important first step toward new treatments for cancer and related diseases, as well as for aging.

Hayflick Limit


The Hayflick limit is the number of times a normal cell can divide before it stops dividing, based on the idea that telomeres reach a critical length.[10] This limit was discovered in the 1960s by Leonard Hayflick, who demonstrated that normal human fetal cells divide around 40 to 60 times before entering senescence. Repeated mitosis shortens the telomeres, which eventually inhibits cell division, a process analogous to aging. The discovery of this limit, a pillar of biology, refuted the earlier contention of Alexis Carrel who, along with the majority of scientists of that period, believed cells were "immortal".

Role of Telomere


Telomeres compensate for the bits of DNA that are lost at the ends of chromosomes during DNA replication. DNA polymerase synthesizes new DNA only in the 5' --> 3' direction and needs a primer to start, so the extreme 3' end of the template strand cannot be fully copied. This results in the incomplete ends shown in the diagram below. However, telomeres are usually very long, ranging from 400 to 600 base pairs in yeast to many kilobases in humans. They are made of repeats six to eight base pairs long, which are usually rich in guanine bases. With long stretches of telomeres at the ends of the DNA strands, the incompletely replicated strands still contain the full genetic code.


Guanosine Tetraplex: a structure of DNA with four strands of DNA. Often the structure of telomere.

The shortening of telomeres in humans induces cell senescence. By limiting the number of cell divisions, this mechanism appears to restrict the formation of cancerous cells. Telomere length has been theorized in recent publications to account for aging in humans. Since cells replicate identically, there must be a reason why cells within a body lose function and viability with time. Telomeres may have some influence over the aging process, since every subsequent DNA replication shortens them. Two aspects of this question are: (i) whether telomere length, as measured in specific cell populations in the body, correlates with longevity or disease; and (ii) whether telomere shortening in any cell population causes functional impairment of that cell population. Some argue that telomere length does not correlate with longevity, since mice have longer telomeres than humans yet live much shorter lives. Others argue that telomere length does correlate with longevity, since it determines the number of times a cell can divide before it dies or reaches senescence.

Recent Publications


Recently it has been found that telomerase activity is inversely related to the length of the telomeres. In other words, telomere elongation happens more often on short telomeres than on long ones. The research showed reduced telomerase activity on telomeres longer than 125 base pairs, and 2 to 3 times more telomerase activity on telomeres shorter than 125 base pairs. This preferential elongation has been demonstrated in yeast and mice, and now in human somatic cells. Kinetic data indicate that elongation in yeast cells is a single event that extends the telomere to a certain length, whereas in human cells elongation seems to be a gradual process. The researchers showed that telomerase adds a regulated length of telomere in each cell division. They also showed that, in human cells expressing telomerase, long telomeres were maintained but not elongated, whereas shorter telomeres were elongated, which indicates that telomeres cannot be extended indefinitely.[11]

Another paper focused on the role of DNA damage response (DDR) proteins in telomere maintenance. The review states that early-stage DNA repair proteins have a significant role in telomere maintenance, whereas late-stage proteins usually do not take part in telomere repair. The interplay between these proteins and the proteins that cap and protect the telomeres is also very important. Many of the stronger DDR proteins inhibit cell replication, so it would be harmful to the organism for these proteins to take part in telomere repair. The protein caps on the telomeres inhibit a full DNA damage response, which keeps the stronger proteins from "repairing" the telomere ends. It is still not clear why some DDR proteins participate in telomere maintenance and others do not, but it is clear that repairing a DNA break and repairing telomeres are two different cellular processes, with the former halting cell division.[12]


  1. "Telomeres, telomerase, and aging: Origin of the theory". Alexey M. Olovnikov. 1999. Retrieved 2009-11-05.
  2. "Repeat Expansion–Detection Analysis of Telomeric Uninterrupted (TTAGGG)n Arrays". 2007. Retrieved 2009-11-05.
  3. "The Nobel Prize in Physiology or Medicine 2009". 2009. Retrieved 2009-11-05.
  4. "What are telomeres and telomerase?". Retrieved 2009-11-05.
  5. "Telomerase: regulation, function and transformation". Retrieved 2009-11-05.
  6. "Hayflick Limit Theory". Retrieved 2009-11-05.
  7. "Extension of Life-Span by Introduction of Telomerase into Normal Human Cells". Retrieved 2009-11-05.
  8. "Anti-Aging Medicine". João Pedro de Magalhães. 2008. Retrieved 2009-11-05.
  9. Foreman, Judy. "Telomerase - a Promising Cancer Drug Stuck in Patent Hell?". Retrieved 2009-11-05.
  10. "Cellular Senescence". João Pedro de Magalhães. 2008. Retrieved 2009-11-17.
  11. Britt-Compton, Bethan; Capper, Rebecca; Rowson, Jan; Baird, Duncan M. (2009). FEBS Letters (583): 3076–3080.
  12. Lyndall, David (2009). The EMBO Journal (28): 2174–2187.

DNA does not always take the form of a double helix. It can also form structures that are considered unusual compared with the common picture of DNA. Normally, DNA adopts a B-form helix. Improper formation of base pairs can greatly affect DNA's structure and flexibility.

Single-stranded nucleic acids can form hairpins. Such structures can affect transcription termination in prokaryotes. Double-stranded DNA, in turn, can form structures called cruciforms.



Hairpin loops are formed by a fold in a single strand of DNA, in which several bases remain unpaired where the strand loops back upon itself. A hairpin loop is only possible if the strand contains, in the correct order, bases complementary to those that appear earlier in the strand. For example, if a DNA strand contained CCGT followed by several bases and then ACGG, the strand would be capable of creating a hairpin loop by folding back on itself.
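The CCGT / ACGG example can be checked mechanically: ACGG is the reverse complement of CCGT, which is what allows the strand to pair with itself. The sketch below is a minimal illustration. The function names, the default stem length of 4, and the minimum loop size of 3 are assumptions made for the example, not a standard algorithm or library API.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Complement each base and reverse the order (DNA alphabet only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def can_form_hairpin(strand: str, stem_len: int = 4, min_loop: int = 3) -> bool:
    """Return True if some stem_len-base stretch is followed, after at least
    min_loop unpaired bases, by its reverse complement.

    stem_len and min_loop are hypothetical defaults chosen for illustration.
    """
    strand = strand.upper()
    for i in range(len(strand) - stem_len + 1):
        stem = strand[i:i + stem_len]
        partner = reverse_complement(stem)
        # The partner must start after the stem plus the unpaired loop bases.
        if strand.find(partner, i + stem_len + min_loop) != -1:
            return True
    return False


print(reverse_complement("CCGT"))        # ACGG
print(can_form_hairpin("CCGTAAAAACGG"))  # True: CCGT pairs with ACGG
print(can_form_hairpin("CCGTACGG"))      # False: no room for an unpaired loop
```

The second call fails only because the two complementary stretches are directly adjacent, so there are no bases left over to form the loop.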

Hairpin loops can occur in both DNA and RNA, though in RNA the thymine base is replaced by uracil. The number of bases in the loop itself is variable, though loops of fewer than three bases do not form, as steric hindrance makes the configuration too unstable.

Here is an image example of hairpin DNA: (Image is of a Long-alpha hairpin)  



Cruciform DNA appears as several hairpin loops arranged in a cross-like structure composed of DNA.

The structure is formed by incomplete exchange of the strands between double-stranded helices.

Eukaryotic cells contain DNA-binding proteins that can specifically recognize cruciform DNA. Interactions with ubiquitous proteins play a crucial role in the conformation of cruciform DNA.

An example of such a DNA-binding protein is Crp1p, which is found in the yeast Saccharomyces cerevisiae.


Triple Helix


The triple helix form of DNA is similar to the double helix DNA except that it contains another oligonucleotide that hydrogen bonds to the bases that are already included in the double helix strands of DNA.

Triple-stranded DNA was a very common hypothesis in the 1950s, when scientists were struggling to work out the true structure of DNA. Watson and Crick, as well as Pauling and Corey, proposed triple-helix models. Watson and Crick found problems with the model, as follows:

  1. Negatively charged phosphates near the axis will repel each other, leaving the question as to how the three-chain structure would stay together.
  2. In a triple-helix model (specifically Pauling and Corey's model), some of the van der Waals distances appear to be too small.[1]

For more information on Triple-stranded DNA see DNA Triple-stranded DNA


Hinged DNA


Hinged DNA (H-DNA) is a triple helix structure that is held together by hydrogen bonds between DNA bases. The three strands pair by Hoogsteen base pairing, a variation of the base pairing seen in nucleic acids such as the A-T pair or the G-C pair. The Hoogsteen base pair applies the "N7 position of purine base and C6 amino group which bind the Watson-Crick face of pyrimidine base." It is also called H-DNA because of its dependence on hydrogen bonds. H-DNA can be found in vitro, during recombination, and in DNA repair.




G-quadruplexes are a family of four-stranded structures formed by guanine-rich nucleic acid sequences. Members of this family share a common square arrangement of four guanines (a G-tetrad) centered around a monovalent cation and stabilized by Hoogsteen hydrogen bonding. The guanines may adopt either an anti or a syn alignment about the glycosidic bond. The backbone strands can also adopt a variety of directionalities: all four strands may be oriented in the same direction; three strands may be oriented in one direction while the fourth runs in the other; two adjacent strands may be oriented in one direction while the other two run in the other; or each strand may have antiparallel neighbors on both sides. The nucleotide sequence that has the potential to form a G-quadruplex is G_x N_a G_x N_b G_x N_c G_x, where x is the number of G residues in each run and N_a, N_b, and N_c are loops of different lengths. Furthermore, G-quadruplexes can form in DNA, RNA, LNA, and PNA, and can be intramolecular, bimolecular, or tetramolecular. Their four-stranded motifs create four grooves, each with varying width and depth. Their folding depends on many factors: DNA sequence, presence of ions, temperature, and presence of various ligands. They are of special interest because of their biological implications, specifically in telomeres and as contributors to gene regulation.
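The sequence pattern of four G runs separated by three loops translates naturally into a regular expression. The sketch below searches a DNA string for such a motif. The function name and the default choices (runs of exactly 3 guanines, loops of 1 to 7 nucleotides) are assumptions made for illustration; the text above does not fix these values, and a match only indicates a sequence with the potential to fold, not that a quadruplex actually forms.

```python
import re


def quadruplex_pattern(g_run: int = 3, loop_min: int = 1, loop_max: int = 7) -> re.Pattern:
    """Build a regex for four runs of g_run guanines separated by three loops.

    The defaults are assumed values for illustration only.
    """
    run = f"G{{{g_run}}}"
    loop = f"[ACGT]{{{loop_min},{loop_max}}}"
    # Three (G-run + loop) units followed by a final G-run
    return re.compile(f"(?:{run}{loop}){{3}}{run}")


pattern = quadruplex_pattern()

# Four copies of the vertebrate telomeric repeat TTAGGG
telomeric = "TTAGGG" * 4
match = pattern.search(telomeric)
print(match.group(0) if match else "no motif found")

# A sequence without guanine runs has no candidate motif
print(pattern.search("ATATATATATATATAT") is None)
```

The telomeric repeat is a natural test case because telomeres are G-rich, which fits the role of G-quadruplexes in telomeres mentioned above.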

A shows a G-tetrad, B shows the anti and syn conformations of guanine, C shows the various directionalities of the backbone strands, D shows the different types of loops

Structure determination of G-quadruplexes by crystallography or solution NMR shows significant variation in conformation and loop geometry, suggesting heterogeneity in the strand topology and loop conformation of G-quadruplexes. Varying conformations can result in varying stability. Furthermore, studies of the various conformations reveal that the nature of the loop sequence and the interactions between the loops and the quadruplex core are important elements in controlling quadruplex topology and stability. For example, in the binding of a quinacridine-based ligand to a G-quadruplex, interactions with the sides of the G-stack do not alter the topology, but interaction with the loop sequence alters the conformation of the loops. This suggests that the loop sequences of the quadruplex are what actually modulate the binding affinity and specificity of the whole structure.

The four-stranded structure, with four grooves instead of the two found in typical DNA, provides a variety of surfaces for interactions with ligands. Aromatic compounds of various dimensions show favorable interactions with the planar surfaces of the terminal guanine tetrads. Intercalation between layers of G-tetrads does not occur, however, because the G-tetrads do not allow bulky aromatic compounds to insert between the layers of guanines.

In eukaryotic telomeres, there are repeats of G-rich sequences that can fold into G-tetrads. It has been postulated that this structure plays an important role in cell aging and in human diseases such as cancer, making it a target for anticancer drugs.


  1. Problems with triple helix model
  2. H-DNA
  3. Cruciform DNA
  4. Sannohe, Yuta; Sugiyama, Hiroshi. "Overview of Formation of G-Quadruplex Structures" Wiley Online Library. 01 Mar. 2010. 20 Nov. 2010.
  5. Martin Egli, Pradeep S Pallan. "The many twists and turns of DNA: template, telomere, tool, and target" Current Opinion in Structural Biology. 08 Apr. 2010. 20 Nov. 2010.
  6. Lubos Bauer, Peter Javorsky, Katarina Tluckova, and Viktor Viglasky. "Evaluation of human telomeric g-quadruplexes: the influence of overhanging sequences on quadruplex stability and folding" Journal of Nucleic Acid. 10 Jun. 2010. 20 Nov. 2010.


relaxed DNA molecule
negative supercoil

The structure of DNA does not only exist as secondary structures such as double helices, but it can fold up on itself to form tertiary structures by supercoiling. Supercoiling allows for the compact packing of circular DNA. Circular DNA still exists as a double helix, but is considered a closed molecule because it is connected in a circular form. A superhelix is formed when the double helix is further coiled around an axis and crosses itself. Supercoiling not only allows for a compact form of DNA, but the extent of coiling also affects the DNA’s interactions with other molecules by determining the ability of the double helix to unwind.

Although supercoiling provides an organized way to compact DNA tightly, the structure is relatively unstable as a result of torsional strain. To minimize the energy required to maintain the structure, the strain is distributed between twist and writhe. Twist refers to the number of turns the two strands make around the axis of the double helix. Writhe refers to the coiling of the double helix axis itself, that is, the bending and overall non-planarity of the DNA.

Supercoiling changes the shape of DNA. The benefit of a supercoiled DNA molecule is its compactness: a supercoiled DNA is more compact than a relaxed DNA molecule of the same length. In experiments, this shows up as supercoiled DNA moving faster than relaxed DNA. The structural differences can therefore be analyzed by techniques such as electrophoresis and centrifugation.

Supercoiling may hinder or favor the unwinding of DNA and thus affects the interaction between DNA and other molecules in cells.

Positive and Negative Supercoilings


1. Negative supercoiling is the right-handed coiling of DNA, so winding occurs in the counterclockwise direction. It is also known as the "underwinding" of DNA.

2. Positive supercoiling is the left-handed coiling of DNA, so winding occurs in the clockwise direction. It is also known as the "overwinding" of DNA.

Although the underwound helix itself has low twisting stress, the crossover (knot) of a negative supercoil has high twisting stress. Prokaryotes and eukaryotes usually have negatively supercoiled DNA. Negative supercoiling is naturally prevalent because it prepares the molecule for processes that require separation of the DNA strands. For example, negative supercoiling is advantageous in replication because the DNA is easier to unwind, whereas positively supercoiled DNA is more condensed and would make strand separation difficult.

The helix must be unwound for DNA transcription and DNA replication, and topoisomerases relieve the resulting strain. Afterwards, the DNA template is supercoiled again as it is packed into chromatin. RNA polymerase also causes the DNA strand to supercoil in two different directions: the region that RNA polymerase has already passed forms negative supercoils, while the region it has not yet passed forms positive supercoils. Supercoils are generated by these processes.



Topoisomerases are enzymes that are responsible for the introduction and elimination of supercoils. Positive and negative supercoils require different topoisomerases, and this specificity prevents distortion of the DNA. The two classes of topoisomerases are Type I and Type II. Type I catalyzes the relaxation of supercoiled DNA, while Type II uses the energy from ATP hydrolysis to add negative supercoils to DNA. Both classes have important roles in DNA transcription, DNA replication, and DNA recombination.

Topoisomerases form loops (unwound regions of the double helix) from negative supercoils. If the DNA lacks superhelical tension, no supercoils are unwound.

Type I topoisomerase


Type I topoisomerases act by creating transient single-strand breaks in DNA. They are further classified as type IA and type IB.

Type IA topoisomerases

The type IA topoisomerase enzyme is a 695-residue monomer that relaxes negatively supercoiled DNA. First, type IA cuts single-stranded DNA and can catenate two circles of single-stranded DNA. It then unwinds the supercoiled duplex DNA by one turn. Type IA uses a strand-passage mechanism: denaturing type IA that has been incubated with single-stranded DNA yields linear DNA attached to the enzyme through a phospho-Tyr diester linkage.

Type IB topoisomerases

Type IB uses a controlled rotation mechanism to relax both negative and positive supercoils. Type IB cleaves a single strand of duplex DNA through nucleophilic attack by an active-site Tyr on the DNA, yielding a 3'-linked phospho-Tyr intermediate and a free 5'-OH group. Type IB consists of several domains and subdomains. Interestingly, type IA topoisomerases form a covalent intermediate with the 5' end of DNA, while type IB topoisomerases form a covalent intermediate with the 3' end of DNA. Historically, type IB topoisomerases were referred to as eukaryotic topo I, but type IB topoisomerases are present in all three domains of life.

Type II topoisomerase


Type II topoisomerase is an enzyme that requires ATP hydrolysis to complete a reaction cycle in which two DNA strands are cleaved, duplex DNA is passed through the break, and the break is resealed. In other words, Type II cuts both strands of a DNA double helix, passes another unbroken DNA duplex through the gap, and then rejoins the cut strands. It is also split into two subclasses, type IIA and type IIB topoisomerases, which share similar structures and mechanisms. Examples of type IIA topoisomerases include eukaryotic topo II, E. coli gyrase, and E. coli topo IV. An example of a type IIB topoisomerase is topo VI. Supercoiled DNA is torsionally strained, so introducing supercoils requires energy. Thus, by coupling the reaction to ATP hydrolysis, the enzyme can introduce negative supercoils.

In bacteria, a Type II topoisomerase is also known as DNA gyrase. Gyrase is an enzyme that acts similarly to human Type II topoisomerase. Some antibiotics act on the bacterial enzyme by blocking the binding of ATP to gyrase, thereby inhibiting the breaking and joining of bacterial DNA chains.



Nucleosomes allow for the compact packing of linear DNA. Nucleosomes are complexes of DNA and histones, consisting of ~145 base pairs of DNA wrapped in a left-handed superhelix around a histone octamer. Histones are a group of small proteins that contain a large number of positively charged amino acids, such as lysine and arginine, which allow them to bind to the negatively charged DNA molecule. The histone octamer is composed of two copies each of H2A, H2B, H3, and H4. The two loops of DNA around the octamer are also held in place by the H1 histone. Nucleosomes are further arranged in a stacked helical complex. Through the extensive wrapping of DNA around the histones, as well as the helical arrangement of the nucleosomes, the linear DNA is compacted. The structural folding of the nucleosomes eventually forms a chromosome.

Chromatin refers to the structure of DNA and its accompanying histones. Chromatin is composed of repeating units called nucleosomes. The five major histones found in chromatin are H2A, H2B, H3, H4, and H1.

Histone protein genes are organized in gene clusters and are expressed in S phase. Once expressed, the histones form histone octamers. By interacting with 146 base pairs of the DNA double helix, a histone octamer becomes a nucleosome. The binding of histones to DNA depends on the amino acid sequence of the histone, not on the nucleotide sequence of the DNA.

Nucleosome core particle

Histones and Transcription Regulation


Histones appear to remain attached to the DNA even during transcription. Because nucleosomes are able to change shape and position, transcription can occur and RNA polymerase can move along the DNA strand. Slight loosening of the binding between the histones and DNA is accomplished by acetylation of the histones, which neutralizes the positively charged residues. Meanwhile, binding is made tighter through methylation, which retains the positive charge of the histones. By changing the charge of the histones in this manner, gene transcription can be regulated.

Histone chaperones are proteins that mediate the assembly and disassembly of chromatin to form correct nucleosome arrangements and to aid in stable folding conformations. These proteins protect and shield the histones from forming incorrect and unwanted aggregates with DNA, which would otherwise occur because of the strong ionic attraction between DNA and histones. DNA is a primarily negatively charged molecule and histones are positively charged, so they have a strong affinity for each other. Histone chaperones, which are typically acidic (negatively charged), help guide histones to form octamers and bind DNA correctly by shielding the positive charge of the histones.

There are different types of histone chaperones, including β-sandwich, α/α earmuff, β-propeller and β-barrel chaperones. β-sandwich chaperones are chaperone monomers that form β-sheets with the histones. An example is ASF1, an anti-silencing function chaperone involved in overexpression during yeast replication. In addition, ASF1 is the first histone chaperone used during assembly of the chromatin. α/α earmuff chaperones are dimers that form α-helical conformations with the histone/DNA complex. An example is the NAP chaperones, which are used to transport histones from the cytoplasm to the nucleus during chromatin assembly. β-propeller chaperones were the first chaperones to be characterized using NMR and crystallography techniques. The function of these pentameric chaperones is the storage of histones. β-barrel chaperones are hetero-oligomers that help facilitate chromatin transcription. In addition, there are irregular or variant histone chaperones that do not fit into any specific structural category. All of these different types of chaperones are involved in different stages of the assembly or disassembly of chromatin.

The energetics of chromatin assembly and disassembly are regulated by histone chaperones. Assembly is an energetically favored process because the histones form a more stable structure when they bind DNA, causing a decrease in energy. Disassembly, on the other hand, is an energetically unfavorable process that requires ATP to break apart the stable histone/DNA interactions.

Nucleosome Sliding


Nucleosome sliding is a frequent result of energy-dependent nucleosome remodelling in vitro.

ATP-Dependent Nucleosome Sliding Mechanism

The paper “Mechanisms of ATP-dependent nucleosome sliding” by Gregory D. Bowman examines how ATPase motors engage and manipulate nucleosomal DNA and discusses possible mechanisms for ATP-dependent sliding of nucleosomes. ATPase motors are shared between chromatin remodelers and a collection of different protein machines. The ATPase motor generates torsional strain when it engages with DNA at an internal site on the nucleosome; this strain in the nucleosomal DNA results from the ATPase motor acting at the SHL2 region. Protection of nucleosomal DNA between SHL2 and the entry/exit site increases when the Isw2 ATPase is activated. ATP-dependent crosslinking of the Isw2 subunit Dpb4 to SHL4 promotes hydrolysis-dependent changes. ISWI-type remodelers form template-committed complexes that allow nucleosomes to slide processively.

Bowman also explains possible variations of the bulge/loop propagation model using ATPase motors. One model suggests that the ATPase motor uses its translocase ability to pull DNA from an entry/exit site in a continuous manner. This pumping allows a remodeler to create a bulge that would rapidly diffuse to the distant entry/exit site. Another model suggests that the histone-DNA contacts are disrupted by a DNA loop that a remodeler ATPase develops around the SHL2 region. This disruption pulls DNA from the linker, and the ATPase motor would move toward the dyad along the DNA loop.

Chromatin Remodeler


Chromatin remodelers are mainly involved in DNA packaging and in facilitating transcript elongation. For example, when a DNA strand coils around nucleosomes for packaging into chromatin, chromatin remodelers arrange the nucleosomes at regular intervals for effective condensation of the DNA. Furthermore, in processes where nucleosomes have to be modified, chromatin remodelers may disassemble a nucleosome into histones or even detach the whole nucleosome from the DNA. The processes that require nucleosome modification by a chromatin remodeler include DNA repair, recombination, transcription and replication. The following picture displays an example of how a chromatin remodeler may be used during transcription catalyzed by RNA Polymerase II.


Palindromic Sequencing


A palindromic sequence is a sequence of nucleotides within a double helix of DNA and/or RNA that is the same when read from 5’ to 3’ on one strand and from 5’ to 3' on the other, complementary, strand. It is also known as a palindrome or an inverted-reverse sequence.

The pairing of nucleotides within the DNA double helix is complementary: Adenine (A) pairs with either Thymine (T) in DNA or Uracil (U) in RNA, while Cytosine (C) pairs with Guanine (G). So if a sequence is palindromic, the nucleotide sequence of one strand is the same as that of its reverse complementary strand. An example of a palindromic sequence is 5’-GGATCC-3’, which has the complementary strand 3’-CCTAGG-5’. This is the sequence that the restriction endonuclease BamHI binds to and cleaves at a specific cleavage site. When the complementary strand is read backwards, the sequence is 5’-GGATCC-3’, which is identical to the first one, making it a palindromic sequence.

Another restriction enzyme, called EcoRI, recognizes and cleaves the following palindromic sequence:

5’ – G A A T T C – 3’
3’ – C T T A A G – 5’
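
The palindrome test described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the sequences are assumed to be DNA written 5’ to 3’, and the third test sequence (GGATCA) is a made-up non-palindromic control that does not come from the text.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq: str) -> bool:
    """A sequence is palindromic if it equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


# BamHI site, EcoRI site, and a hypothetical non-palindromic control.
for site in ("GGATCC", "GAATTC", "GGATCA"):
    print(site, reverse_complement(site), is_palindromic(site))
```

Both restriction sites read the same on the two strands, while the control sequence does not.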


Image of a palindrome in a DNA structure. A = Palindrome, B = Loop, C = Stem

Relationship between Sequence and Protein Structure


Many researchers have studied the relationship between palindromic sequences and protein structures. Studies have shown that the frequent appearance of palindromic sequences, also called palindromic peptides, in protein sequences is not just due to chance. Scientists suggest that these sequences are important for the structure and function of different proteins. Some of these protein groups include DNA binding proteins, ion channels and rhodopsin, and metal binding proteins and receptors. By comparing palindromes with set sequences from the database, scientists can try to find the roles of palindromic sequences.

Another topic within palindromic sequences which is being studied is whether the symmetry of palindromic sequences affects the structure and folds of peptides. One hypothesis is that by reversing the sequence, the resulting folds would be mirror-images of the original fold. The conclusion states that because both the original and reverse proteins have identical amino acid compositions which lead to similar hydrophobic-hydrophilic patterns, the reversing sequence results in the same folds as opposed to the mirror-image folds. Another hypothesis guided by research is that by reversing a sequence, the fold could change or possibly be destroyed. This shows evidence that the similarity in reverse sequencing does not reflect structural similarity, which means that they do not form symmetrical protein structures.

Effect on genomic instability in yeast


Palindromic sequences have been tied to different genomic rearrangements in different organisms, depending on the length of the repeated sequences. Shorter palindromic sequences (shorter than 30 bp) are very stable, while longer sequences are not stable in vivo. These sequences occur in both eukaryotes and prokaryotes. They also increase inter- and intrachromosomal recombination between homologous sequences. Hairpin structures can form from palindromic sequences due to base pairing in single-stranded DNA. These structures can be substrates for structure-specific nucleases and repair enzymes, which can lead to a double-strand break in the DNA. This then leads to loss of genomic material, which can cause meiotic recombination. A 140-bp long mutated palindromic sequence inserted in yeast has been shown to lower postmeiotic segregation and increase the rate of gene conversion, while shorter sequences do the opposite. Research also found that during meiosis, double-strand breaks are induced by the long 140-bp palindromic sequence. In the long hairpin structure, the entire stem-loop is not covered and is exposed to the processing endonuclease, which makes nicks in the loop. This nick creates a gap which is repaired using the wild-type strand. The induction of double-strand breaks during meiosis is what causes genomic instability.

Likelihood of palindromic sequences in proteins


There has not been an abundance of studies focusing on the significance of palindromic sequences in proteins, but the few that exist tell us a lot about the relationship between palindromic sequences and protein function. By understanding the actual formation of these palindromic sequences and their properties, researchers can tie these sequences to functions. It has been found that decreasing the complexity of the amino acid composition increases the likelihood of a palindromic sequence. The next step relates to the likelihood of palindromic sequences in proteins, which may be due to the frequent formation of alpha helices by palindromes.



(c) Acdx, from the Wikipedia Commons <>

Jankowski, Craig, Dilip K. Nag, and Farooq Nasar. "Long Palindromic Sequences Induce Double-Strand Breaks during Meiosis in Yeast." National Center for Biotechnology Information. U.S. National Library of Medicine, 20 May 2000. Web. 7 Dec. 2012. <>.

"Palindromic Sequences." Wikipedia. Wikimedia Foundation, 11 Aug. 2012. Web. 07 Dec. 2012. <>.

Sheari, Armita, Mehdi Kargar, Ali Katanforoush, Shahriar Arab, Mehdi Sadeghi, Hamid Pezeshk, Changiz Eslahchi, and Sayed-Amir Marashi. "A Tale of Two Symmetrical Tails: Structural and Functional Characteristics of Palindromes in Proteins." National Center for Biotechnology Information. U.S. National Library of Medicine, 11 June 2008. Web. 07 Dec. 2012.

When a DNA solution is heated enough, the double-stranded DNA unwinds and the hydrogen bonds that hold the two strands together weaken and finally break. The process of breaking double-stranded DNA into single strands is known as DNA denaturation, or DNA denaturing. The temperature at which the DNA is half denatured, meaning half double-stranded and half single-stranded, is called the melting temperature (Tm). The amount of strand separation, or melting, is measured by the absorbance of the DNA solution at 260 nm. Nucleic acids absorb light at this wavelength because of the electronic structure of their bases, but when two strands of DNA come together, the close proximity of the bases in the two strands quenches some of this absorbance. When the two strands separate, this quenching disappears and the absorbance rises by 30%-40%. This is called hyperchromicity. The hypochromic effect is the effect of stacked bases in a double helix absorbing less ultraviolet light.
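
As a rough illustration of how Tm can be read off a melting curve, the sketch below converts absorbance readings at 260 nm into a fraction denatured and interpolates linearly to the half-denatured point. The temperatures and absorbances are invented example values (chosen so that the absorbance rises by 36%), the function names are hypothetical, and treating the lowest and highest readings as fully double-stranded and fully single-stranded is a simplifying assumption.

```python
def fraction_denatured(absorbances):
    """Scale A260 readings so the lowest is 0 (duplex) and the highest is 1 (single strands)."""
    low, high = min(absorbances), max(absorbances)
    return [(a - low) / (high - low) for a in absorbances]


def estimate_tm(temps, absorbances):
    """Linearly interpolate the temperature at which half of the DNA is denatured."""
    points = list(zip(temps, fraction_denatured(absorbances)))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("curve never crosses the half-denatured point")


# Hypothetical melting curve: temperature in degrees C, absorbance at 260 nm.
temps = [60, 70, 80, 90, 100]
a260 = [1.00, 1.02, 1.18, 1.34, 1.36]

print(round(estimate_tm(temps, a260), 1))  # close to 80 for these made-up readings
```

Real melting curves are sigmoidal and are usually fitted rather than interpolated between a handful of points; the linear step here is only meant to show the idea of the half-way point.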

Applications of DNA denaturation


Differences between two DNA sequences can also be detected by using DNA denaturation. The DNA is heated and denatured into the single-stranded state, and the mixture is then cooled to permit the strands to re-hybridize. Hybrid molecules form between similar sequences, and any differences between those sequences disrupt the base pairing.

What determines the Melting Temperature (Tm)?


While the ratio of G (Guanine) to C (Cytosine) and of A (Adenine) to T (Thymine) in an organism's DNA is fixed, the GC content (percentage of G + C) can vary considerably from one DNA to another. The GC content of DNA has a significant effect on its Tm. Because G-C pairs form three hydrogen bonds, while A-T pairs form only two, the higher the GC content, the higher the Tm. Thus, a double-stranded DNA rich in G and C needs more energy to be separated than one that is rich in A and T, meaning it has a higher melting temperature (Tm). Above the Tm, DNA denatures, and below it, DNA anneals. Annealing is the reverse of denaturation.
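
The dependence of Tm on GC content can be illustrated with a short calculation. The sketch below computes the GC percentage of a sequence and applies the Wallace rule, a common rule of thumb for short oligonucleotides (2 °C per A or T, 4 °C per G or C). The Wallace rule is not taken from this text and is only a rough approximation; the function names and the example sequences are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of bases in the sequence that are G or C."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb Tm in degrees C for a short oligonucleotide (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Three hypothetical oligonucleotides of similar length.
for oligo in ("ATATATATATATAT", "GGATCCGAATTC", "GCGCGCGCGCGCGC"):
    print(oligo, gc_content(oligo), wallace_tm(oligo))
```

The estimate rises with GC content, which matches the trend described above; for long genomic DNA, salt concentration and sequence context also matter and a simple count like this is not adequate.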

One aspect of thermal denaturation is rarely discussed. The heat supplied to effect such denaturation has no preferred direction and is therefore a scalar quantity. However, separating the strands of a double helix involves unwinding, which has a direction and is therefore a vector. The issue is this: how does a scalar change induce a vector change?

Other methods to denature DNA


Heating is not the only way to denature DNA. Organic solvents such as dimethyl sulfoxide and formamide, or high pH, can also break the hydrogen bonding between DNA strands. Low salt concentration can also denature double-stranded DNA by removing the ions that shield the negative charges on the two strands from each other.

The central difficulty with denaturation of the double helix remains. How would two strands, typically wound around each other for many turns, and often many hundreds of turns, actually separate after the hydrogen bonds have been severed?

A further, major difficulty lies in the fact that applying heat to a suspension of nucleic acids amounts to applying a scalar quantity, because heat applied in this way has no direction. However, unwinding the strands requires an angular force, and this is a vector because it has a preferred direction. It has never been explained how a scalar quantity (heat) can effect a vector change (rotation) in a solution.

A solution to these problems is offered by the side-by-side models, in which there is no net winding of the strands around each other.

The Avery-MacLeod-McCarty Experiment was presented by Oswald Avery, Colin MacLeod, and Maclyn McCarty in 1944. During the 1930s and early 1940s, Avery and MacLeod performed this experiment at the Rockefeller Institute for Medical Research, after the departure of MacLeod [...] virulency (a measure of deadly potency). This experiment would allow them to determine whether rough bacteria could be transformed into smooth bacteria, hence passing along the genetic information causing the transformation. By isolating and purifying this chemical component, they could deduce whether it had the characteristics of a protein or of a DNA molecule.



The purpose behind this experiment was to better understand the chemical component that carries the genetic information and transforms one type of bacterium into another.



Bacteria grown in petri dishes can form spots, or colonies, inside the dish as they multiply under certain conditions. Virulent (deadly) colonies look smooth, like tiny droplets, whereas non-deadly bacteria form colonies with rigid, uneven edges, which are described as rough. While analyzing a certain kind of pneumonia caused by bacteria in mice, they were able to isolate a "variant" (mutant) strain that did not kill the mice. During the experiments, Avery and MacLeod injected a mouse simultaneously with "boiled" or dead smooth bacteria and live rough bacteria. After a short while they were surprised to see that the mouse died. When they took samples from the dead mice and cultured the samples in a petri dish, Avery and MacLeod found that what grew inside the culture was in fact the smooth deadly bacteria. This suggested that something from the "dead" bacteria had somehow converted the rough bacteria into smooth bacteria. The rough bacteria had been permanently converted, or transformed, into the smooth dangerous bacteria. They had confirmed that they could not grow smooth bacteria from the boiled culture, and that the dead smooth bacteria did not cause disease when injected alone. This all implied that a chemical component in the smooth bacteria survived and transformed the rough bacteria into smooth. Isolating and purifying that chemical component showed that it was DNA, not protein, that transferred the genetic code from the smooth to the rough bacteria.

Simpler Experimental

Avery-MacLeod-McCarty Experiment

Here is a simpler demonstration for this experiment by Oswald Avery, Colin MacLeod, and Maclyn McCarty. There are two sets of bacteria – one is smooth (virulent), one is rough (nonvirulent).

1) They first injected deadly encapsulated bacteria into the mouse – the mouse died.

2) They then injected non-encapsulated, nonvirulent bacteria into the mouse – the mouse lived.

3) Next, they heated the virulent bacteria to a temperature that killed them and injected these bacteria into the mouse – the mouse lived.

4) After that, they mixed the heat-killed virulent bacteria with the living non-encapsulated, nonvirulent bacteria. The mixture was then injected into the mouse – the mouse died.

5) Finally, they mixed the live, non-encapsulated harmless bacteria with DNA that was extracted from the heat-killed, lethal bacteria. These “harmless” bacteria were injected into the mouse after being mixed – the mouse died.

From these experiments, Avery and his group showed that nonvirulent bacteria become deadly after mixing with the DNA of the virulent bacteria. Such a demonstration shows that the nonvirulent bacteria became virulent because of genetic information that originally came from the virulent bacteria. The protein from the virulent bacteria had already been denatured during Step 3. Thus, it was DNA and not protein that transferred the genetic information to the nonvirulent bacteria.

Griffith Experiment

In 1928, Frederick Griffith performed a DNA experiment using pneumonia bacteria and mice. This experiment provided evidence that some particular chemical within cells is genetic material. The objective of the experiment was to find the material within the cells responsible for the genetic codes.

For the experiment, Griffith used Streptococcus pneumoniae, a bacterium that causes pneumonia. The bacterium exists in two strains, a smooth and a rough strain. The smooth strain causes pneumonia and has a polysaccharide coating around it. The rough strain does not cause pneumonia and lacks the polysaccharide coating. For his first experiment, Griffith took the S strain (smooth strain) and injected it into mice. He found that the mice contracted pneumonia and died. He then took the R strain (rough strain) and injected it into mice, and found that they did not contract pneumonia and survived the injection. Through these first two experiments Griffith concluded that the polysaccharide coating on the bacteria somehow caused the pneumonia, so he used heat to kill the bacteria of the S strain (polysaccharides are susceptible to heat) and injected the dead bacteria into mice. He found that the mice lived, which indicated that the polysaccharide coating was not what caused the disease, but rather something inside the living cell. He then hypothesized that the heat used to kill the bacteria had denatured a protein within the living cells that caused the disease. He then injected mice with heat-killed S strain and live R strain together, which resulted in the mice dying.

Griffith performed a necropsy on the dead mice and isolated S strain bacteria from the corpses. He concluded that the live R strain bacteria must have absorbed genetic material from the dead S strain bacteria. This is called transformation, a process in which one strain of a bacterium absorbs genetic material from another strain and turns into the type of bacterium whose genetic material it absorbed. Since heat denatures proteins, the protein in the bacterial chromosomes was not the genetic material. However, evidence pointed to DNA. This experiment that Griffith performed was a precursor to the Avery experiment. Avery, MacLeod and McCarty followed up on the experiment because they wanted a more definitive experiment and answer.

Avery, MacLeod and McCarty used heat to kill virulent Streptococcus pneumoniae bacteria and extracted RNA, DNA, carbohydrates, lipids and proteins, which were all considered possible candidates for the carriers of genetic information, from the dead cells. Each type of molecule was added to a culture of live non-virulent bacteria to determine which was responsible for changing them into virulent bacteria. DNA was the only molecule that turned the non-virulent cells into virulent cells, and they concluded that it was the genetic material within cells.



Lockshin, Richard A., The Joy of Science 2007.



In 1944 Oswald Avery and colleagues performed an experiment that used pathogenic bacteria to determine the material that contained genetic information. Their experiment concluded that it was DNA, and not protein, that is the hereditary material. Despite these findings, the popular and widely accepted conclusion remained that protein encoded genetic information, on account of its diversity in function and much greater number compared to DNA. In order to gain more evidence on DNA, the scientists Alfred Hershey and Martha Chase decided to perform a simple but effective experiment involving bacteriophages.



In order to understand the experiment that was performed, we must first examine the vector that played a crucial role in it: the bacteriophage. Bacteriophages are viruses that infect bacteria such as Escherichia coli. They consist of a protein coat, collar, base plate, tail fibers and, most importantly, a head which houses the genetic material. They have a unique feature that makes them the perfect candidate for testing whether DNA or protein houses the genetic information: an outer capsule of proteins surrounding an inner core of DNA. Bacteriophages, being viruses, are unable to proliferate on their own, as they lack the necessary machinery to do so. Viruses invade a host cell, inject their genetic material into the host, and let the host replicate the viral genes. Knowing this, Hershey and Chase saw that if they labeled the bacteriophages they would be able to track what is passed on to the host bacteria: the labeled DNA or the labeled protein coat.



A bacteriophage was taken and its encased DNA was labeled with radioactive 32P, while its protein coat was left nonradioactive. The bacteriophage was exposed to a sample of bacteria. The phage attached to the surface of the bacterial cell and injected the labeled DNA. The sample was then chilled to arrest growth and shaken vigorously for several minutes in a Waring Blender. This process separated the phage coats from the surface of the bacteria. The sample was then centrifuged at high speed. The bacterial cells were at the bottom of the tube and the phage particles were in the supernatant. Hershey and Chase discovered that there was no disruption in the reproduction of phages inside the bacteria. A new generation of viruses had successfully propagated inside the host cell, and these phages exhibited 32P radioactivity in their own DNA.

Another set of bacteriophages was then examined, this time with a protein coat labeled with radioactive 35S and nonradioactive DNA. The same procedure was followed: the phage attached to the bacterial wall and was allowed to inject its genetic material. Vigorous shaking of the bacteria caused the radioactive viral sheath to detach from the bacteria. Injection of the viral DNA into the bacteria still occurred, and new phages were observed to have been produced. However, analysis of the new phages inside the bacteria showed that they were not radioactive, a property which should have been present in the new phages if proteins were in fact the genetic material passed on to progeny. This experiment therefore illustrated that DNA, not protein, is the source of genetic information.




The first bacterial cell contained phages with observable radioactivity, illustrating that the radioactive label present on the parent phage was passed on to the new phages. The second bacterial cell, however, showed no hint of 35S, showing that the label was removed along with the protein coat and did not enter the bacteria. Hershey and Chase then deduced that the genetic material being passed on is DNA, and not protein as previously accepted.

This famous 1952 experiment allowed Hershey and Chase to demonstrate that it was DNA, not protein, that functioned as the T2 phage's genetic material. Viral proteins, labeled with radioactive sulfur, remained outside the host cell during infection. In contrast, the viral DNA, which was labeled with radioactive phosphorus, entered the bacterial cell. They concluded that DNA is in fact the material within cells that contains the genetic information.




Berg, Jeremy, John Tymoczko, Lubert Stryer. Biochemistry

Historical information


Determination of the DNA structure would not have been possible without the work of Erwin Chargaff, an Austro-Hungarian biochemist. Chargaff did his first work on lipids and lipoproteins, but after reading about an experiment by Oswald Avery which showed that DNA was the material encoding genetic information, he turned his attention to DNA.

The tetranucleotide hypothesis, proposed by Phoebus Levene, was the mainstream theory in Chargaff’s time. The theory suggested “that DNA was made up of equal amounts of four bases – adenine, guanine, cytosine, and thymine – but that it was organized in a way that was too simple to enable it to carry genetic information.” The four bases are held together by hydrogen bonds and are located inside the DNA helical structure, whereas the sugar and phosphate backbone is on the outside. The two strands are complementary to each other, and thus one strand depends on the other. Despite the results of Avery’s experiments showing that DNA encodes life, the scientific community was convinced that DNA was too simple to carry genetic information. Chargaff was not satisfied with the tetranucleotide postulate because of the minimal data that supported it.



Erwin and his colleagues collected several DNA samples throughout their discovery. Using the fairly new technique of paper chromatography, Chargaff and his associates proceeded to separate DNA. The DNA that they collected was subjected to acid. The acid would then hydrolyze the phosphodiester bonds as it would cause a nucleophilic attack on the bond and result in the backbone breaking up. Once the phosphodiester bonds were broken then the individual nucleotides would then be separated and be free to analyze. Ultraviolet spectrophotometry was used to analyze the exact amounts of bases that were present in the DNA sample.

UV spectrophotometry showed that there was not an equal amount of purine bases (Adenine and Guanine) and pyrimidine bases (Cytosine and Thymine). Chargaff and his partners showed that the tetranucleotide hypothesis was in fact wrong in assuming that all four bases were in equal amounts. In other words, the concentration of GC equals to the concentration of AT. However, in RNA, Thymine is replaced with Uracil. What Chargaff noticed however was that although not all were in equal amounts certain bases were equal to each other. The base Guanine was equivalent to the amount of cytosine present; and the same held true for Adenine and Thymine. The ratio of A/T and C/G bases held true for all organisms and for both of the strands that were separated. The noticeable proportionality between one purine base to another pyrimidine base as well as it being true for both strands would be crucial in determining the helical structure of DNA although Chargaff was unable to see it.

The experiment gave two discoveries, which are now summarized as Chargaff’s Rule:

1. The number of adenine bases is equal to the number of thymine bases, and the number of cytosine bases is equal to the number of guanine bases: A = T and C = G, with A + T + C + G = 100%.

2. The proportions A:T and C:G hold true for both strands.

For example, in human DNA the four bases adenine (A), thymine (T), cytosine (C), and guanine (G) are present in these percentages: A = 30.9% and T = 29.4%; G = 19.9% and C = 19.8%. The approximate equalities A = T and G = C illustrate Chargaff's Rule, which remained unexplained until the discovery of the double helix by Watson and Crick.
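The human percentages quoted above can be checked with a short calculation. The following is a minimal sketch in Python; the function name `chargaff_ratios` and the dictionary layout are illustrative choices, not part of any standard library. It computes the A/T and G/C ratios, which Chargaff's Rule predicts to be close to 1, alongside the A/G ratio, which the rule does not constrain.

```python
# Base composition of human DNA as quoted in the text (percent of total bases).
human = {"A": 30.9, "T": 29.4, "G": 19.9, "C": 19.8}


def chargaff_ratios(composition):
    """Return base ratios for a composition given as {base: percent}."""
    a, t, g, c = (composition[base] for base in "ATGC")
    return {
        "A/T": a / t,                            # expected to be close to 1
        "G/C": g / c,                            # expected to be close to 1
        "purine/pyrimidine": (a + g) / (t + c),  # follows from the two above
        "A/G": a / g,                            # not fixed by Chargaff's Rule
    }


ratios = chargaff_ratios(human)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

With these numbers the A/T and G/C ratios come out within a few percent of 1, while A/G is roughly 1.5, which is the pattern that ruled out the tetranucleotide hypothesis.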



Berg, Jeremy. Biochemistry. 6th edition. ISBN-13 9780716787242

Campbell, Neil. Biology. Pearson Publishing. Dec 2004.

Watson, James. DNA : The Secret of Life. Knopf Publishing Group. Aug 2004.

Inspirations to the Discovery of DNA Structure


James Watson's interest in DNA structure began when he was in college. In 1945, during his third year of college, he read Erwin Schrödinger's What Is Life? and took away the message that genes are the key components of living cells, so "we must know how genes act". In 1951, at an international conference in Naples, Maurice Wilkins of King's College, London, showed his clear X-ray pictures of DNA to Watson. Determined to work on DNA structure, Watson moved to Sir Lawrence Bragg's biophysics unit at the Cavendish Laboratory in Cambridge, England, where he met biophysicist Francis Crick. Other scientists, including Rosalind Franklin, were studying DNA structure with the help of X-ray diffraction. During the same year, Franklin held a seminar at King's College in London, to which Watson was invited. Her X-ray photographs pointed to a helical structure for DNA. During the seminar, Watson learned that Franklin's research indicated a helical structure consisting of two to four interlaced helical chains. Each helix had a phosphate-sugar backbone with attached bases (adenine, guanine, thymine, and cytosine). The bases were thought to be attached to the inside of the helix, possibly forming links between the helical chains. After Franklin's seminar, Watson decided to build DNA models.

Continuation on the Discovery


Nevertheless, the diffraction patterns predicted by these models did not fit those of real DNA. The models that Watson built turned out to be wrong (the bases were on the outside of the helix and the helix was dehydrated) because he had misinterpreted Franklin's findings. Watson and Crick's research on DNA structure was halted following objections from King's College, and they had to return to their previous research: tobacco mosaic virus (TMV) for Watson and proteins for Crick.

Although barred from working on the structure of DNA, Watson was able to continue because one of the main components of TMV is nucleic acid, and Francis Crick continued with it outside of his official research. In 1952, Watson described Alfred Hershey's discovery that the genetic material of viruses is DNA, comparing the DNA in virus heads to "a hat in a hatbox". Watson and Crick also had a disastrous meeting with Erwin Chargaff of Columbia University, who had discovered the ratios of the amounts of the DNA bases. From John Griffith, the nephew of Frederick Griffith, whose work helped establish DNA as the carrier of genetic information, Crick learned that guanine (G) is attracted to cytosine (C), and adenine (A) to thymine (T). Crick deduced that the bases must fit together like two interleaved decks of cards, stacked on top of one another inside the entwined backbones.

Watson was convinced that DNA must be helical on the basis of Crick's proposed DNA structure and the X-ray diffraction plates. When Franklin showed Crick and Watson her X-ray pictures of DNA, the pictures did not show the radial symmetry expected of helices, but they did show that the crystals were overlapping.

In autumn of 1952, Watson became friends with Linus Pauling's son, Peter. At that time, Linus Pauling was one of the few scientists who appreciated the importance of DNA structure. From Peter, Watson learned that Linus Pauling had written a paper on DNA structure proposing three helically entwined chains with the sugar-phosphate backbone at the center of the coil, and that outdated X-ray pictures had been taken to "prove" the structure. This triple helix should not be confused with the alpha helix, the protein structure that Pauling had described earlier. Watson immediately knew that Pauling's structure was incorrect because of the previous models that Watson himself had built. In fact, Pauling's structure left out an important detail: he had not assigned ionization charges to the phosphate groups. As modeled, nothing would properly hold the long, thin chains together, so they would unravel and fall apart; and without the charges, the nucleic acid was not even an acid.

Watson and Crick knew that Linus Pauling was their main competitor in determining the structure of DNA. Knowing that one of the greatest scientists had made several mistakes in deducing the structure, Watson and Crick resolved to tackle the DNA structure at the Cavendish Laboratory. They worked on the DNA model using metal plates and Franklin's X-ray crystallography pictures of DNA, provided by Maurice Wilkins and Max Perutz. Besides matching the bases, they also determined that the width between the two DNA strands must be less than two nanometers. In order to fit the bases inside the strands, Watson at first believed that like bases should be paired together. However, like-with-like pairs could not be fitted within such a small width. Watson then discovered that, with the bases in their keto forms, A pairs with T and C pairs with G, and that these base pairs fit inside the double strands. In five weeks, Watson and Crick built a DNA model that proved to be the correct structure of DNA.

On April 25, 1953, Watson and Crick published their article "Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid" in Nature, becoming the first to publish the structure of DNA as a double helix.

Importance of Discovery


This discovery shed light on how genetic material could be passed on from generation to generation, and showed how simple the mechanism for transferring genetic material could be. In fact, our present understanding of the storage and utilization of a cell’s genetic information is based on work made possible by this discovery.

DNA Structure – Leading to Function


After looking at the X-ray diffraction images made by Rosalind Franklin, Watson and Crick were able to deduce that the shape of DNA was a double helix, and from Chargaff’s experiment they were able to deduce that G pairs with C and A pairs with T. The two kinds of base pair have equal widths and fit exactly between the two sugar-phosphate chains. Watson and Crick hypothesized that the bonds between the paired bases of the two chains are hydrogen bonds, which are easily broken. The discovery of DNA structure thus gave them a very good idea of how DNA might replicate itself, and thus how genetic material is passed on.

In their article, Watson and Crick claim that DNA is a double-helical structure, and they argue against Pauling's previous attempt to define the structure, noting that it lacked the much-needed hydrogen-bond stabilization and underestimated the van der Waals interactions of base stacking. The helix is right-handed, and the two chains run in opposite directions. The bases point toward the inside of the helix, and the sugar-phosphate linkages form the outer backbone. The helix repeats every 10 residues, or 34 Angstroms, with 3.4 Angstroms between adjacent residues, as seen in the crystallographic data from Rosalind Franklin. The diameter of the helix was found to be 20 Angstroms, and there is a rotation of 36 degrees per base, giving 10 bases every 360 degrees. The most innovative idea of Watson and Crick's model was that the two chains are held together by the purine and pyrimidine bases. Through hydrogen bonding, a purine must be bonded to a pyrimidine, creating a complementary pair. Using experimental data showing that the amounts of adenine and thymine were very close, as were those of guanine and cytosine, they stated that adenine bonds to thymine and guanine bonds to cytosine. They reached this conclusion by comparing the ratios A/T, C/G, and A/G: the first two ratios were the closest to 1, whereas the third varied. This led them to conclude that A bonds only with T and C bonds only with G. The DNA nucleotide must also contain deoxyribose and not ribose, because the extra oxygen on ribose would interfere with the structure through van der Waals interactions.
They also recognized that each of the bases is capable of tautomerizing between the enol and keto forms. Experimentally, it was determined that the keto form predominates at physiological pH. This also suggested how DNA may denature as pH changes, through conversion from the keto to the enol form.
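The helix dimensions quoted above are related by simple arithmetic. The sketch below, in Python, uses only the figures given in the text (3.4 Angstroms per base pair and 10 base pairs per turn); the function names are illustrative and the values are those of the idealized model, not measurements of any particular DNA sample.

```python
RISE_PER_BASE_PAIR = 3.4   # Angstroms between adjacent base pairs (from the text)
BASE_PAIRS_PER_TURN = 10   # base pairs per full helical turn (from the text)


def helix_pitch(rise=RISE_PER_BASE_PAIR, bp_per_turn=BASE_PAIRS_PER_TURN):
    """Length of one full helical turn, in Angstroms."""
    return rise * bp_per_turn


def rotation_per_base_pair(bp_per_turn=BASE_PAIRS_PER_TURN):
    """Twist between adjacent base pairs, in degrees."""
    return 360.0 / bp_per_turn


def helix_length(n_base_pairs, rise=RISE_PER_BASE_PAIR):
    """Contour length of an idealized helix of n base pairs, in Angstroms."""
    return n_base_pairs * rise


print(helix_pitch())             # pitch of one turn, in Angstroms
print(rotation_per_base_pair())  # degrees of rotation per base pair
print(helix_length(100))         # length of a 100 base pair stretch
```

The pitch works out to 34 Angstroms and the rotation to 36 degrees per base pair, matching the values stated for the model.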

Later Discoveries


Watson and Crick’s discovery led to many new investigations, such as the structure of RNA, how DNA contains all the information for protein production, and the Human Genome Project, which set out to map all human genes.

Although the discovery of the structure of DNA was attributed to Watson and Crick, a key player in helping them discover this structure was the scientist Rosalind Franklin. Franklin, along with Maurice Wilkins, applied X-ray crystallography to DNA to find out its structural properties. X-ray crystallography involves exposing a crystal specimen (here, DNA) to X-rays to determine the locations of the atoms in the “molecules that comprises basic unit of crystal called unit cell”. The task, however, was not an easy one.

Obtaining a clear diffraction pattern required that the crystal be pure and the X-ray beam strong enough. However, as Franklin realized, DNA existed in two forms in equilibrium, which resulted in a very unclear diffraction pattern. These forms were the A form, which is the dehydrated form of DNA, and the B form, which forms a long, fibrous structure under humid conditions. Franklin, applying her chemistry background, then proceeded to isolate these two forms using clever laboratory techniques such as “manipulation of the critical hydration of her specimens” [2]. The A and B forms were then separated and subjected to X-ray crystallography, yielding the pictures that would be the basis of Watson and Crick’s helical DNA structure, among them the famous photo no. 51 of the B form of DNA.

Fiber Diffraction


Fiber diffraction is a method for determining structural information about a molecule from X-ray scattering data. Rosalind Franklin used this technique to obtain structural information on DNA. In the experiment, a fiber is placed in the path of an X-ray beam at a right angle to it. The diffraction pattern is recorded on the film of a detector placed a few centimeters away from the fiber. The fiber diffraction pattern is a two-dimensional pattern showing the helical symmetry of a molecule, rather than the three-dimensional symmetry obtained by X-ray crystallography. A good fiber diffraction pattern exhibits four-quadrant symmetry; the axis aligned with the fiber is called the meridian, and the axis perpendicular to the fiber is called the equator. Franklin obtained a diffraction pattern from a non-crystalline DNA fiber, and from it she deduced the B-form of DNA.


  • The diffraction pattern obtained by Franklin and Wilkins showed an X-shaped pattern, which hinted at a two-stranded helical form.


  • They also observed that the pattern was consistent and inferred that the helix’s diameter must also be consistent.


  • The helical turn of DNA corresponds to the horizontal lines in the picture and measures 34 Angstroms. They also calculated that the gap between base pairs was 3.4 Angstroms, as measured from the distance between the center of the X and its ends. Dividing the two gives 10 nucleotides per turn.

Franklin and Wilkins also showed that the sugar-phosphate backbones lie on the outside of the helix and not inside, as was previously thought. They came to this conclusion from the A and B forms of DNA: the existence of hydrated and dry forms showed that water could easily come in and bind to DNA, which could happen only if the sugar-phosphate backbones were on the outside.



- showed that DNA’s helical structure was composed of two strands

- established that DNA’s diameter was similar throughout

- calculated that one turn was 34 Angstroms, that the distance between base pairs was 3.4 Angstroms, and that there were 10 nucleotides per helical turn

- showed that the sugar-phosphate backbones were located on the outside of the structure



Berg, Jeremy. Biochemistry Textbook

DNA replication is required for all cell division, which allows organisms to grow. In DNA replication, the two strands of the parental DNA are first separated, and each serves as a template, so that both daughter molecules carry the same genetic information as the original. The point at which the strands begin to separate is called the "origin". The double-stranded structure of DNA aids the replication mechanism: once the two strands are separated, the complementary strand of each is synthesized by DNA polymerase, an enzyme that specializes in making complementary strands. It finds the correct complementary base for each position and extends the new strand in the 5' to 3' direction. Because each daughter molecule preserves one of the original strands, the process is called "semiconservative replication". DNA replication is essential in the life cycles of biological organisms. It is initiated when the double-stranded DNA located at the origin of replication is separated, or melted. Once the double-stranded DNA is melted, the melted region is propagated and a mature replication fork forms. DNA melting and replication fork formation are coordinated by initiators, helicases, and other cellular factors. Recent structural biochemistry studies of initiators and replicative helicases have focused on archaeal and eukaryotic cells. The results of these studies have provided new insight into possible mechanisms of the early stages of DNA replication.

Replication of genomic DNA is a common and essential process in all living things. Replication can be divided into initiation, elongation, and termination steps.
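The template-copying logic of semiconservative replication can be illustrated with a toy model. The following Python sketch represents each strand as a string written 5' to 3'; the names `synthesize_complement` and `replicate` are hypothetical, and the model ignores enzymes, primers, and fork dynamics entirely.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_complement(template):
    """Return the strand complementary to `template`.

    Both the template and the result are written 5' -> 3'. Because the
    strands are antiparallel, the template is read from its 3' end.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template))


def replicate(duplex):
    """Semiconservative replication of a duplex given as (top, bottom).

    Each daughter duplex keeps one parental strand and gains one newly
    synthesized complementary strand.
    """
    top, bottom = duplex
    daughter_1 = (top, synthesize_complement(top))        # old top, new bottom
    daughter_2 = (synthesize_complement(bottom), bottom)  # new top, old bottom
    return daughter_1, daughter_2


parent_top = "ATGCGT"
parent = (parent_top, synthesize_complement(parent_top))
daughter_1, daughter_2 = replicate(parent)
print(parent)
print(daughter_1)
print(daughter_2)
```

In this model both daughters carry the same sequence as the parent, and each contains exactly one of the original strands, which is what "semiconservative" means.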

Initiation of DNA replication


During initiation, initiators recognize and then bind the replication origin DNA, converting it to a replication fork. Initiation consists of the following steps: initiators assemble around the origin DNA, and the dsDNA origin is melted. The melting of dsDNA produces a replication fork on each side of the origin to allow bi-directional replication. Before this can happen, however, there are topological limitations that must be overcome to convert the melted origin to a fork structure. Biochemical methods can be used to follow the assembly of initiators at the origin and to detect the initial melting of origin dsDNA. In the archaeal and eukaryotic cellular systems, the duration of origin melting is still uncertain. However, origin melting has been shown to be induced by the assembly of LTag. SV40 LTag is capable of inducing origin melting and unwinding, and it is therefore considered to be the initiator in the eukaryotic system. It has been used as a model to study origin recognition, assembly, and the melting process. To convert the melted dsDNA origin into an active replication fork, the assembled initiators expand the melted region and position the helicase on the fork.

The initiation step is one of 3 steps in DNA replication (along with elongation and termination). In initiation, replication proteins called initiators convert the DNA into a replication fork. This is accomplished first by the initiator proteins assembling around the DNA, which causes melting of the dsDNA (double-stranded DNA) origin. The origin melting then produces a replication fork on each side of the melted origin, resulting in bi-directional replication. Ring-shaped helicases assist in this process. However, the mechanism by which the initiators and helicase melt and unwind the origin DNA is not well understood, owing to the lack of high-resolution structures of the intermediates.

In eukaryotic and archaeal cellular systems, the initiator proteins include Orc, Cdc6, Cdt1, and the MCM (mini-chromosome maintenance) helicase. MCM is one of the most important factors in the formation of the unwound fork. MCM forms hexamers that can dimerize into double hexamers. The SV40 large T antigen (LTag) helicase is able to recognize the origin DNA and can melt and unwind the DNA into a replication fork without the use of cofactors. SV40 LTag is considered the archetypal initiator/helicase in eukaryotic systems and is a model for studying recognition, assembly, and melting.

Crystal structures of the LTag hexamer reveal a channel of 13-17 Å, which is wide enough for ssDNA to pass through but not dsDNA (20 Å). It is believed that melted ssDNA is encircled in the central channel of the hexameric helicase, even during assembly at the origin.

LTag also has β-hairpins in the central channel that are configured in a planar arrangement. The β-hairpins form 2 adjacent planar rings with the DR/F loops, which contribute to the narrowest part of the channel in the AAA+ domain. It is an open question whether LTag can expand to accommodate dsDNA or whether the dsDNA is modified by the initiator/helicase to fit the narrow channel. For the latter to occur, LTag must squeeze and crush the dsDNA, which disrupts the base pairs and melts the dsDNA. This model is often referred to as the squeeze-to-open model.

The most widely accepted model for fork unwinding is that of a ring-shaped helicase that encircles and migrates down one DNA strand, splitting the dsDNA into ssDNA.

In prokaryotic cells, bacterial replicases contain a polymerase, polymerase III (Pol III), a β2 factor, and a DnaX complex. They are highly processive and cycle rapidly during Okazaki fragment synthesis. DnaA (an origin recognition protein) can start the melting of the origin into single-stranded DNA (ssDNA). The ssDNA is the site for loading the hexameric helicase DnaB (which exists only as single hexamers). This bacterial helicase, DnaB6, separates the two strands at the replication fork. It translocates in the 5'-->3' direction, using ATP hydrolysis to move down the strand and split the two strands. The DNA polymerase III holoenzyme (Pol III HE) makes contact at the replication fork and functions as a dimer that appears to have a regulated affinity for the lagging strand, allowing it to recycle between primers during Okazaki fragment synthesis. Primase interacts with the helicase and synthesizes short RNA primers for Okazaki fragment synthesis. Pol III HE extends each RNA primer until it receives a signal to move to the next primer at the replication fork. During the process, the RNA primers are removed and the gaps between the Okazaki fragments are filled by DNA polymerase I, and the remaining nicks are sealed by DNA ligase. DnaB has its N-terminal end free for docking primase, making it easy for the primase to capture the ssDNA emerging from the N-terminal domain during fork unwinding.

Initiating Replication in Archaea and Eukaryotes by Melting the Double Stranded DNA


Although not much is known about the initiation of replication by the melting of double-stranded DNA, recent studies have shed light on possible mechanisms for this process. Two co-crystal structures from archaea that contain both the initiators and the origin DNA show how the initiators recognize the double-stranded origin DNA. The complexes, Cdc6/Orc-dsDNA, show the double-stranded DNA deforming and bending, but not melting. Thus, researchers believe that additional factors, such as the MCM mentioned in the section above, are needed to trigger the melting of the double-stranded DNA and to generate higher-order complexes at the origin of replication.


This image represents an example of the structure of a DNA replication initiator, specifically showing the Cdc21 and Cdc54 (similar to the Cdc6 described above) N-terminal domain. The initiator, Cdc6/ORC 1 (which is not depicted here but can be represented by the picture above), binds to the origin of replication and bends the DNA.

In eukaryotes, the SV40 LTag at the origin is able to trigger the melting of the origin of replication and the subsequent unwinding of DNA, making it the initiator-helicase that is used as a model system for examining origin recognition, assembly, and the melting of the double-stranded DNA. The crystal structures of LTag hexamers that are not bound to DNA have channels that seem able to accommodate only single-stranded DNA and not double-stranded DNA: the channels are usually about 13 to 17 Å (angstroms) wide, while double-stranded DNA has a diameter of about 20 Å and so cannot fit inside. Generally, studies of DNA translocation have shown that for double-stranded DNA to fit inside the channel of an LTag hexamer without changing its shape, the channel must be at least 20 Å in diameter. In addition to being too narrow, crystal structures of LTag hexamers have a planar arrangement of β-hairpins in the central channel.


Here is an example of a β-hairpin, a component of the LTag hexamer structure. The β-strands in the β-hairpin are antiparallel, meaning that the N-terminus of one β-strand is aligned with the C-terminus of the other. In the case of the LTag hexamer, the β-hairpins lie in the same plane in the central region of the channel.

Recently, cryo-EM has demonstrated that LTag hexamer channels can bind double-stranded DNA, with two hexamers surrounding the double-stranded DNA. Researchers, however, are still unsure whether the double-stranded DNA changes configuration because of the initiator-helicase or whether the LTag widens to allow the double-stranded DNA to bind. One model, the squeeze-to-open model, asserts that the LTag hexamer can fit the origin double-stranded DNA into its narrower channel by squeezing the DNA through the channel. As a result, base pairs are disrupted and the double-stranded DNA origin melts. This model has been proposed and is in the process of being tested because it appears to be consistent with the known data regarding DNA melting.

The Formation of the Replication Fork by the Squeeze-Pumping Model:

The squeeze-pumping model derives from information that comes from the structure of the Ltag hexamers. The structure includes a narrow channel as mentioned above, an AAA+ motor domain, a side channel where single stranded DNA can exit, and inter-Zn domains. This model is based on the DNA being melted by the squeeze-to-open model described above, where the melted DNA is pumped to the Zn-Domain until it generates the single stranded DNA loop which can then leave the channel and form the replication fork.

Translocation of Single and Partially Hydrolyzed Double Stranded DNA:

Researchers have demonstrated that double-hexameric LTag and MCM have the ability to unwind DNA. In its double-hexameric form, LTag has been shown to unwind long double-stranded DNA that includes an internal origin sequence. This differs from the steric exclusion model of fork unwinding, which is the most widely accepted model. The steric exclusion model is based on evidence showing that a ring-shaped helicase surrounds and moves down one of the DNA strands toward the double-stranded DNA fork, exposing the other single strand in the process.


The above image represents a ring-shaped hexameric helicase structure that surrounds and moves down the DNA strands (which are not depicted in the photo).


Curr Opin Struct Biol. 2010 Sep 24. [Epub ahead of print] Origin DNA melting and unwinding in DNA replication. Gai D, Chang YP, Chen XS. Molecular and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA.


For decades, DNA replication and protein synthesis were studied separately, and few scientists discussed the link between these two critical processes in living organisms. In their article “When DNA replication and protein synthesis come together”, Jonathan Berthon, Ryosuke Fujikane, and Patrick Forterre provide a detailed explanation of the connections between these seemingly independent fields of structural biochemistry. They suggest that unexpected but real connections between DNA replication and protein synthesis are found in the three domains of life, especially in Archaea and Eukarya. They believe that there are mechanisms that couple DNA and protein synthesis. Such mechanisms can be found in the activities of (p)ppGpp, a guanosine polyphosphate derivative, and of GTPases of the Obg family.

The stringent response is a phenomenon that links DNA replication in bacteria to changes in amino acid availability. When amino acid starvation occurs, a dramatic increase in the intracellular (p)ppGpp concentration is observed, which initiates the shutdown of rRNA gene transcription as well as protein synthesis. This process, however, varies among different bacteria. For instance, in Bacillus subtilis, amino acid starvation, along with the inhibition of rRNA gene transcription, blocks the elongation step of DNA replication. (p)ppGpp also inhibits the DnaG primase in Bacillus subtilis and could directly affect Okazaki fragment synthesis on the lagging DNA strand during replication. On the other hand, the stringent response in Escherichia coli leads to immediate interference with the initiation of DNA replication. Such findings are important evidence of a direct connection between protein synthesis and the DNA replication process: amino acid starvation has the potential to stop DNA replication.

Another source of connection is the Obg family. Obg is known for its ability to couple ribosome biogenesis, a critical step in the production of proteins because mRNA is translated on the ribosome, with DNA replication. Scientists argue that the link between ribosome biogenesis and DNA replication starts from proteins that originally functioned in the making of the ribosome. These proteins participate in the regulation of the stringent response in bacteria as well as in the stabilization of DNA replication forks. One type of Obg, called ObgE, is useful in controlling the levels of (p)ppGpp. One important link between DNA replication and protein synthesis found in ObgE is that depletion of ObgE causes problems in chromosome segregation and cell separation. This is significant in showing that changes in certain proteins directly affect the pattern of DNA replication and the organism’s genetic processing. For this reason, Obg studies were done to test the direct role that this type of protein plays in connecting DNA replication and protein synthesis.

Similarly, a protein family called NOG1 (Nucleolar G-protein) also participates in the making of the ribosome. Nog1p from this family belongs to a complex that contains many other proteins that directly take part in DNA replication, such as Orc6p (origin recognition complex), Mcm6p, some subunits of the MCM complex, Yph1p, and Rrb1p. Kilian made the important statement that changes in proteins that connect ribosome biogenesis to DNA replication would cause “chromosome instability” and “tumor formation”. He also concluded that there exists a network of proteins that directly link the production of ribosomes and DNA replication in the Eukarya domain.

All of the above studies and conclusions apply only to Eukarya because no clear evidence has been found for the domain of Archaea. Scientists, however, have found a cluster of genes that encode both DNA replication and translation proteins. This cluster includes numerous genes, among them essential ones such as aIF-2, a likely point of regulation of both DNA replication and protein synthesis. Phosphorylation of the eukaryotic counterpart, eIF-2, is a major component in the mechanism of protein synthesis in eukaryotic cells. Another important component is Nop10, which plays a part in rRNA development. Examining these components suggests a close relation between protein synthesis and DNA replication. One important example is the phenomenon in which the two ribosomal proteins L44E and S27E interfere with the DNA replication process under special conditions such as amino acid starvation, previously discussed in the case of the stringent response.

In conclusion, in both Archaea and Eukarya, many experimental data confirm or suggest close connections between protein synthesis and DNA replication. The stringent response is one example of how amino acid starvation inhibits the initiation of DNA replication.

DNA Replication Fork


The DNA replication process works in an "assembly line"-like fashion. The DNA double helix is pulled apart and a copy of each strand is produced. Many enzymes take part and must be present for this vital process to occur correctly.

Biological Proteins and Enzymes Required for DNA Replication (in chronological order)


Replication Fork


When DNA is replicated, a replication fork forms as helicase separates the two strands of the DNA. The new strands made at the fork are called the leading strand and the lagging strand. The leading strand is the new DNA strand that DNA polymerase synthesizes continuously in the 5'-3' direction. The lagging strand, on the other hand, is made on the opposite template, which runs in the other direction, and so it is synthesized discontinuously as Okazaki fragments. Primase builds RNA primers, and DNA polymerase uses the 3' OH group of each RNA primer to extend the DNA in the 5' to 3' direction. The RNA primers are then replaced with deoxyribonucleotides, and the fragments are joined together by DNA ligase to complete the chain.

As the DNA unwinds, the DNA ahead of the fork is forced to rotate, twisting the structure. This is a problem for replication because over-twisted DNA eventually becomes physically incapable of being replicated. To solve this problem, enzymes called DNA topoisomerases are used. Topoisomerase I cuts one strand of the DNA backbone to allow the DNA to untwist, and topoisomerase II cuts the backbones of both strands to allow another DNA segment to pass through, which keeps DNA molecules from tangling together.
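To give a sense of scale for the over-twisting problem, the sketch below estimates how many helical turns must be removed as a fork advances. It is a rough illustration in Python, assuming the figure of 10 base pairs per turn quoted earlier for the double helix and assuming that the DNA ahead of the fork cannot rotate freely, so that every turn unwound behind the fork appears as one extra turn of over-twisting ahead of it. The function names and the example numbers are hypothetical.

```python
BASE_PAIRS_PER_TURN = 10  # assumed helical repeat, as quoted for the double helix


def turns_unwound(base_pairs_unwound):
    """Number of helical turns removed when this many base pairs are separated."""
    return base_pairs_unwound / BASE_PAIRS_PER_TURN


def residual_overtwist(base_pairs_unwound, turns_relaxed=0.0):
    """Excess turns left ahead of the fork after topoisomerase action.

    Assumes the DNA ahead of the fork is rotationally constrained, so each
    unwound turn becomes one turn of over-twisting unless a topoisomerase
    relaxes it.
    """
    return max(0.0, turns_unwound(base_pairs_unwound) - turns_relaxed)


# Hypothetical example: a fork that has unwound 1,000 base pairs.
print(residual_overtwist(1000))                     # no topoisomerase activity
print(residual_overtwist(1000, turns_relaxed=100))  # fully relaxed
```

Under these assumptions, unwinding 1,000 base pairs generates 100 turns of over-twisting, all of which must be relaxed for the fork to keep moving.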



Helicases are motor proteins that move along the double-stranded nucleic acids and actively unwind the double helix. The enzyme uses the energy produced by the hydrolysis of ATP to ADP to unwind and separate a strand of DNA. This is done by the breaking of the hydrogen bonds between the annealed nucleotide bases. Helicase opening of the double strand can be categorized into two different cases: active opening and passive opening. In the active opening case, helicase directly destabilizes the double strand nucleic acid (dsNA) to promote the separation of the two strands. In the case of passive opening, the helicase enzyme binds to a single strand nucleic acid (ssNA) that existed due to thermal fluctuation which induces the opening of part of the double strand. It is found that active opening can increase the rate of unwinding of the DNA strand by 7 folds compared to passive opening. The product of this action is two template strands. One is known as the Leading Strand and the other is known as the Lagging Strand.

The leading strand is the new strand that is synthesized continuously without interruption, while the lagging strand is formed in fragments. These fragments are called Okazaki fragments. This explains how both new strands can be synthesized in the 5'->3' direction despite the fact that the two template strands are antiparallel. Fragmentary synthesis lets each piece grow 5'->3' while the lagging strand as a whole appears to form in the 3'->5' direction.
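The fragmentary synthesis described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names, the fixed fragment length, and the example sequence are hypothetical, and primers, ligase and the enzymes themselves are not modelled. The lagging-strand template is written 5'->3', so the fork exposes its 5' end first.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand_5to3):
    """Return the complementary strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand_5to3))


def okazaki_fragments(lagging_template_5to3, fragment_len):
    """Return the fragments (each written 5'->3') in the order they are made.

    The fork exposes the template from its 5' end, so the first chunk
    exposed is copied first; each chunk is copied 5'->3', away from the fork.
    """
    chunks = [
        lagging_template_5to3[i:i + fragment_len]
        for i in range(0, len(lagging_template_5to3), fragment_len)
    ]
    return [reverse_complement(chunk) for chunk in chunks]


def join_fragments(fragments_in_order_made):
    """Join the fragments into one strand, as ligase does after primer removal.

    The most recently made fragment lies at the 5' end of the finished strand.
    """
    return "".join(reversed(fragments_in_order_made))


template = "ATGCGTAC"  # hypothetical lagging-strand template, 5'->3'
fragments = okazaki_fragments(template, 3)
lagging_strand = join_fragments(fragments)
```

Every fragment is built 5'->3', yet the fragments accumulate toward the 5' end of the finished strand, which is why the lagging strand as a whole seems to grow 3'->5'. The joined result equals the reverse complement of the whole template.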

Single-Stranded DNA Binding Proteins


The single-stranded DNA binding proteins bind to the DNA templates in a way that prevents the two separated strands from reannealing. These proteins keep the strands apart so that both of them can serve as templates for replication. This allows the remainder of the replication machinery to get into position and begin making new DNA strands.

DNA Polymerase


(see DNA Polymerase Section)

RNA Primase


RNA primase attaches itself to the lagging-strand template in a position adjacent to the helicase. Its function in DNA replication is to lay down short RNA primers, which it synthesizes in the 5' to 3' direction. These primers provide the 3' OH starting points from which DNA polymerase adds complementary nucleotides. The stretches of DNA synthesized between RNA primers on the lagging strand are known as Okazaki fragments. The leading strand needs only a single primer, but primase must act repeatedly on the lagging strand, because DNA polymerase can only add complementary bases in a 5' to 3' direction and the lagging-strand template is exposed in the opposite orientation as the fork opens.

  DNA Replicases from a Bacterial Perspective

Mitochondrial DNA Replication

Human Mitochondrion Genome

Mitochondrial DNA (mtDNA) is maintained apart from nuclear DNA. Because of its small size, mtDNA carries only 37 genes and yields 13 protein products, whereas the haploid nuclear genome encodes over 20,000 genes. However, it can provide a model system for studying nuclear DNA replication. The circular mtDNA genome contains approximately 16,600 base pairs in human beings. The encoded genes are necessary for making ATP by way of oxidative phosphorylation. There seems to be no specific phase in which mtDNA is replicated, meaning that replication can take place repeatedly during a cell cycle.

The endosymbiont hypothesis is the idea that mitochondria were engulfed to create the first eukaryote. Evidence supporting this hypothesis comes from the existence of mtDNA itself. Because mitochondria were once free-living bacteria, it might be anticipated that the mechanics of mtDNA maintenance would show greater similarity to prokaryotes over eukaryotes.

The mechanism by which mtDNA replicates was first described in 1972 using electron microscopy. All replicating mtDNA molecules observed had a single-stranded branch. This led to the conclusion that leading-strand and lagging-strand synthesis are uncoupled in mitochondria, unlike at the replication fork of nuclear DNA. Human mtDNA is typically arranged in covalently closed circles that are about one genome in length. Three mechanisms have been proposed for mtDNA replication. In the first, there is a strand-displacement replication fork in which leading-strand DNA synthesis occurs in the absence of lagging-strand DNA synthesis. In the second, DNA synthesis is carried out by conventional coupled leading- and lagging-strand synthesis. In the third, delayed lagging-strand DNA synthesis is accompanied by incorporation of RNA on the lagging strand, termed RITOLS (RNA incorporation throughout the lagging strand).

The issue of how mammals replicate their mtDNA was reopened (the mtDNA replication redux) in an attempt to test the idea that biased segregation of human pathological mtDNA variants was related to replicative advantage, as suggested for yeast mtDNA. 2D agarose gel electrophoresis (2D-AGE), a technique that had been used to define details of the mechanisms of replication for nuclear, plasmid, and viral genomes, was used to resolve replication intermediates from mitochondria. Many replication intermediates from crude mitochondrial preparations were sensitive to single-strand nuclease, as predicted by the strand-displacement model (SDM), but a subset formed arcs indistinguishable from those associated with replicating nuclear and prokaryotic DNA.

However, purer preparations of mitochondria yielded not partially single-stranded DNA but RNA/DNA hybrids. This suggested that the SDM intermediates seen in earlier studies could be explained by RNA loss during isolation and processing.

In conclusion, there is still controversy about the mechanisms of mtDNA replication. The strand-displacement model requires a minimum of two primer maturation events for each strand, and the same applies to RITOLS replication. The identification of Dna2 and Fen1 in mitochondria provides new tools for studying mtDNA replication. Manipulating their expression and studying mutant variants that disrupt mtDNA replication might prove to be very informative.

Mutations, deletions, and other problematic rearrangements of mtDNA have been found to increase as a mammal ages. The accumulation of mutated mtDNA in single cells causes respiratory chain deficiency, which shortens the mammal's life span. When there are many mutations, it can also cause aging phenotypes such as weight loss and loss of hair.[1]


  1. DNA Replication and Transcription in Mammalian Mitochondria.

Holt, Ian J. "Mitochondrial DNA replication and repair: all a flap."

  • Berthon, Jonathan, Ryosuke Fujikane, and Patrick Forterre. “When DNA replication and protein synthesis come together”. Trends in Biochemical Sciences. Vol.34, no.9 (2009): 429-434. Cell Press.
  • Dahai Gai, Y Paul Chang and Xiaojiang Chen. "Origin DNA melting and unwinding in DNA Replication." Current Opinion in Structural Biology 2010, 20:1-7. Elsevier.
  • Charles S. McHenry. "DNA Replicases from a Bacterial Perspective." Annual Review of Biochemistry Volume 40 2011 July, 403-36.

General information


The full process of DNA replication involves the intricate and coordinated interplay of more than 20 proteins. In 1958, Arthur Kornberg and his colleagues isolated DNA polymerase from E. coli. DNA polymerase was the first known enzyme whose function is to promote the formation of the bonds that join the units of the DNA backbone. E. coli has several DNA polymerases, designated by Roman numerals, that play important roles in DNA replication and repair.

DNA polymerase is an enzyme. This enzyme synthesizes a new DNA strand from an old DNA template and also works to repair the DNA in order to avoid mutations. DNA polymerase catalyzes the formation of the phosphodiester bond which makes up the backbone of DNA molecules. It uses a magnesium ion in catalytic activity to balance the charge from the phosphate group.

DNA polymerase adds nucleotides only to the 3' end of an existing strand; it cannot start a new chain on its own. Another DNA polymerase function is error correction, the correction of mistakes made in the new DNA strand. The entire DNA polymerase family consists of 7 different subgroups: A, B, C, D, X, Y and RT. Eukaryotes have at least 15 different DNA polymerases. However, none of the eukaryotic polymerases can remove primers, and only the elongation polymerases can proofread the sequence.

Although there are different types of DNA polymerases that differ greatly in detail, all have common structural features and a very similar overall shape. At least 5 structural classes of DNA polymerase have been identified. They take the shape of a hand, with specific regions referred to as the fingers, the palm, and the thumb. In all classes of DNA polymerase, the thumb and fingers wrap around the DNA, holding it across the active site of the enzyme, while the palm contributes the residues that make up this active site. Moreover, all DNA polymerases use similar strategies to catalyze the reaction.

Diagram of DNA polymerase extending and proofreading a DNA strand

General Formulation


DNA polymerases are the catalysts in the step-by-step addition of deoxyribonucleotide units to a DNA chain. The reaction catalyzed is

(DNA)n + dNTP ↔ (DNA)n+1 + PPi

where dNTP stands for any deoxyribonucleoside triphosphate and PPi is a pyrophosphate ion.



1. All four activated precursors are needed for the reaction to occur: the deoxynucleoside 5’-triphosphates dATP, dGTP, dCTP, and dTTP, in addition to Mg2+ ions. Typically, two metal ions take part in the reaction. One interacts with the primer while the other interacts with the dNTP. The carboxylate groups of aspartate residues in the polymerase active site hold the two metal ions in place.

2. The new DNA chain is constructed directly on a pre-existing DNA template. DNA polymerases catalyze the formation of phosphodiester bonds efficiently only if the base on the incoming nucleoside triphosphate is complementary to the base on the template strand. In other words, DNA polymerase is a template-directed enzyme: it reads the existing DNA strand and synthesizes a new strand with the complementary sequence.

3. DNA polymerases require a primer to start synthesis. The chain-elongation reaction catalyzed by DNA polymerases is a nucleophilic attack by the 3’-OH terminus of the growing chain on the innermost phosphorus atom of the deoxynucleoside triphosphate. Therefore, a primer strand with a free 3'-OH group must be bound to the template strand from the start. This primer is made of RNA. Because RNA synthesis can begin without a primer, an RNA primer is what starts the synthesis of DNA. Once synthesis has been initiated and the complementary DNA is formed, the RNA piece is removed and replaced by the proper DNA sequence. A phosphodiester bridge is formed in the reaction and pyrophosphate is released. The subsequent hydrolysis of pyrophosphate into two ions of orthophosphate (Pi) by pyrophosphatase helps to drive the polymerization forward. Elongation of the DNA chain proceeds in the 5’-to-3’ direction.

4. Many DNA polymerases are able to remove mismatched nucleotides as a method of correcting mistakes in DNA. These polymerases possess a distinct nuclease activity that allows them to eliminate incorrect bases through a separate reaction. The DNA polymerase reverses its direction by one base pair, excises the incorrect base, replaces it with the proper one, and continues with the rest of replication. Because of this 3' to 5' exonuclease activity, DNA replication is remarkably reliable. This process is called proofreading. However, it is not perfect, which is why natural mutations and related diseases can still arise.
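The four properties listed above (dNTP addition, template direction, primer requirement, and proofreading) can be illustrated with a toy Python model. It is a sketch under stated assumptions, not a kinetic model: the function name `extend_primer`, the `error_rate` and `seed` parameters, and the example sequences are hypothetical, the primer is assumed to be already correctly paired, and every mismatch is assumed to be caught by the exonuclease, which, as noted above, is not true of real polymerases.

```python
import random

# Watson-Crick pairing used to pick the nucleotide complementary to the template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5, primer_5to3, error_rate=0.0, seed=0):
    """Extend a primer along a template; return (new_strand_5to3, corrections).

    template_3to5 is written 3'->5' so that index i of the template pairs
    with index i of the new strand, which is written 5'->3'.
    """
    if not primer_5to3:
        # DNA polymerase needs a free 3'-OH; it cannot start a chain on its own.
        raise ValueError("a primer with a free 3'-OH end is required")
    rng = random.Random(seed)
    strand = list(primer_5to3)
    corrections = 0
    for position in range(len(strand), len(template_3to5)):
        correct = COMPLEMENT[template_3to5[position]]
        incoming = correct
        if rng.random() < error_rate:
            # Hypothetical misincorporation: one of the three wrong bases.
            incoming = rng.choice([b for b in "ACGT" if b != correct])
        strand.append(incoming)  # added at the 3' end; PPi is released
        if strand[-1] != correct:
            strand.pop()  # 3'->5' exonuclease step removes the mismatch
            strand.append(correct)  # synthesis resumes with the proper base
            corrections += 1
    return "".join(strand), corrections


product, fixes = extend_primer("TACGGATC", "AT")
```

The strand only ever grows at its 3' end, so synthesis runs 5'->3', and calling the function with an empty primer raises an error, mirroring the primer requirement in point 3.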

Eukaryotic DNA Polymerases


DNA polymerases play a key role in the synthesis of DNA; without them, life could not exist. These polymerases are multi-subunit complexes whose components must work together to function efficiently. Polymerases act on a single-stranded template to synthesize a complementary strand. In eukaryotic cells, there are 5 families of DNA polymerase, which together account for as many as 15 different enzymes. Three DNA polymerases are critical for DNA replication: Polymerase α-primase, Polymerase δ, and Polymerase ε. These three polymerases function at the replication fork. The DNA strands are unwound by the MCM helicase, which is part of the CMG complex (Cdc45-MCM-GINS). Polymerase α-primase initiates replication on both the leading and lagging strands by laying down RNA primers of about 10 nucleotides.

After initiation, Polymerase δ and ε are brought to the complex and tethered, which increases the productivity of the enzymes. Specifically, Pol δ synthesizes the lagging strand while Pol ε synthesizes the leading strand. The roles of these polymerases were established by genetic experiments. For Pol ε, a mutation was introduced in the active site. This mutation lowered the enzyme's fidelity, so that it left behind a characteristic signature of errors in the regions where it was active. With the help of reporter genes, this showed that Pol ε did indeed participate in the synthesis of the leading strand. The same genetic approach was used for Pol δ to demonstrate its activity on the lagging strand.

Bases must be incorporated with consistent accuracy. A misincorporation occurs only about once in every 10,000 replicated base pairs. When one does occur at the end of the DNA primer strand, that end must be moved out of the polymerase active site and into the exonuclease domain. There the mismatch is removed by proofreading, which allows synthesis to continue from a stable, correctly paired end. [1]
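To get a feel for what one misincorporation per 10,000 base pairs means in practice, the arithmetic can be written out as below. The function name and the example length of 1,000,000 base pairs are hypothetical choices made only for illustration, and the calculation describes misincorporation before proofreading, not the final error rate.

```python
def expected_misincorporations(bases_copied, bases_per_error=10_000):
    """Expected number of misincorporated bases before proofreading."""
    return bases_copied / bases_per_error


# Hypothetical example: copying one million base pairs.
errors_before_proofreading = expected_misincorporations(1_000_000)
```

Under these assumptions, copying a stretch of one million base pairs would produce about 100 mismatches for the exonuclease domain to correct, which shows why proofreading is needed.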

Polymerase Families


Because polymerases are central to life, their structures and roles have been studied intensively. To date, 7 different families have been identified. Five are found in eukaryotic cells, and others are unique to bacteria and archaea. The polymerase families share a core structure made up of palm, finger, and thumb domains. From there the families diverge according to their specific cellular functions. The 7 families are labeled A, B, C, D, X, Y, and reverse transcriptase. Family A includes Pol I, which functions in repair and also takes part in processing Okazaki fragments during replication of the lagging strand. Family B includes the eukaryotic polymerases delta, alpha, and epsilon. Family C includes Polymerase III, the main replicative polymerase in bacteria. Family D includes polymerases that are exclusive to archaea. Families X and Y include enzymes that carry out repair.

Schematic summary of the compositions of eukaryotic DNA polymerases. [a] Polymerase α. [b] Polymerase ε and [c] Polymerase δ. The common core structure can be seen in the larger catalytic subunits. Each polymerase also has its own unique smaller subunits that allow it to function in its own specific way.

Within the Eukaryotic DNA Polymerase Structures


As noted earlier, the polymerases are multi-subunit entities and are very complex. Each structure consists of a large catalytic subunit (a member of the B family) and several smaller subunits. The architecture of the B family polymerases is consistent: an N-terminal domain, a 3’-5’ exonuclease domain, and palm, finger, and thumb domains, arranged in a ring-like structure. The catalytic subunits of all the eukaryotic polymerases are assumed to be related and to derive from a common ancestor via gene duplication. However, studies show that the catalytic subunit of Pol ε is larger than the other two because of additional sequences.

Obtaining high-resolution structures is essential for further analysis of polymerases. To date, there has been a lot of progress in determining the structures of the different subunits that make up the polymerases, but only at low resolution. The first structure reported was the cryo-EM structure of Pol ε. Researchers aim for high-resolution structures because these can allow further understanding of the fidelity of DNA synthesis and of the highly regulated genome that is maintained in all eukaryotic cells. Furthermore, they would allow the design of genetic experiments to explore the interactions of and within the complexes.[2]



"Molecular Recognition and Catalysis in Translation Termination Complexes" by Bruno P. Klaholz. IGBMC (Institute of Genetics and of Molecular and Cellular Biology), Department of Structural Biology and Genomics, Illkirch, F-67404 France. Trends in Biochemical Sciences, May 2011, Vol 36, No. 5

"Crystal Structure and Functional Analysis of the Eukaryotes Class" Mol. Cell 14, 233-245. Kong, C. et al. (2004)

DNA initiation is the first stage of the DNA replication process. During this stage, the double-stranded DNA (dsDNA) is first separated into single strands by breaking the hydrogen bonds between base pairs. The separation of dsDNA into single-stranded DNA (ssDNA) is known as DNA melting. Proteins that are responsible for melting dsDNA are called initiator proteins. In the next step, proteins called helicases bind to the DNA and unwind it to create a replication fork. In eukaryotic and archaeal cells, melting and unwinding of DNA are mainly accomplished by the mini-chromosome maintenance (MCM) helicase along with multiple initiation proteins. However, helicases such as large T antigen (LTag) and E1, which are found in simian virus 40 (SV40) and bovine papillomavirus (BPV) respectively, are able to melt and unwind the dsDNA without any additional cofactors (Chen et al.). Because of the great similarity between the viral and the eukaryotic and archaeal DNA replication systems, LTag and E1 are studied intensively by researchers in the hope of gaining a better understanding of the replication process.



Differences in the arrangement of β-hairpins, the mode of ATP binding, and other features of the viral proteins can lead to different mechanisms of melting and unwinding. For example, LTag of SV40 is believed to follow the double-pump looping model described in Chen et al. First, LTag in the shape of a double hexamer binds to the dsDNA at the replication origin and compresses it to break the hydrogen bonds between the two strands. The two hexamers then pump the dsDNA ahead of the replication origin into the double hexamer to create a replication fork in which the ssDNA emerges as loops. Further pumping of the dsDNA elongates the replication fork and allows fork progression. On the other hand, E1 of BPV uses a mechanism that closely follows the steric exclusion model (Chen et al.). In this model, the E1 helicase exists only as a single hexamer and separates into two trimers, each of which binds to one strand of the dsDNA at the origin to induce melting. After successfully melting the dsDNA, the two trimers rejoin to form a hexamer that binds to only one strand and unwinds the DNA to create a replication fork.



Although they present plausible mechanisms for DNA melting and formation of the replication fork, both models still require the support of further evidence. Questions such as how LTag binds to dsDNA or whether the E1 hexamer can separate into two trimers remain unanswered. More intensive investigation and research are therefore needed.



Chen, Xiaojiang S, Paul Chang and Dahai Gai. "Origin DNA melting and unwinding in DNA replication". Current Opinion in Structural Biology 2010, 20:1-7.

Meselson – Stahl Experiment

Meselson and Stahl Experiment

Theories of Replication of DNA



Conservative replication: The daughter DNA is composed entirely of new DNA, and the parent DNA retains its original backbones and bases.


Semi-conservative replication: Replication produces two copies of DNA that are each made up of 50% DNA from the parent DNA helix and 50% new DNA. In this situation, each daughter DNA double helix contains one strand that is old DNA (from the parent) and one strand that is new (the complementary strand resulting from the replication).


Dispersive replication: This form of replication also produces daughter DNA that consists of 50% new DNA and 50% parent DNA. However, in this case the new DNA and old DNA are shuffled, and fragments of each are found on both strands of the helices in both copies of DNA following replication.


Schematic of the three theories of replication, by CJHIGGIN

The Experiment


Watson and Crick proposed that DNA replicated semi-conservatively, but conservative and dispersive replication remained plausible until those theories could be disproved. In 1957, Matthew Meselson and Franklin Stahl devised an experiment to determine whether DNA replicated following a conservative, semi-conservative, or dispersive model.


Meselson and Stahl cultured Escherichia coli in a medium containing a heavy isotope of nitrogen (15N) as the only nitrogen source, as opposed to the more common nitrogen-14 (14N). After several generations, the E. coli contained DNA whose nucleotide bases were made with the 15N isotope. The (15N) DNA was denser than the common (14N) DNA, and the difference in densities allowed for separation by density gradient equilibrium sedimentation.

To achieve separation of the E. coli DNA by densities, the DNA was mixed with a solution of CsCl and centrifuged. A CsCl density gradient was created as a result of sedimentation and diffusion working against each other. The DNA molecules were found in the area of the CsCl density gradient that was equal to their own density.

The (15N) E. coli cells were transferred to a medium that contained only (14N). DNA was isolated from the first generation of cells grown in the (14N) medium and analyzed by density gradient equilibrium sedimentation. Then DNA from the second generation of E. coli grown in the (14N) medium was extracted and analyzed.



The first generation of E. coli grown in the 14N medium contained a single DNA band found halfway between where the 14N DNA band and the 15N DNA band should have been. This demonstrated the presence of DNA that was lighter than the DNA from the original population of E. coli grown in the 15N medium, but still heavier than 14N DNA. From the position of this intermediate band in the density gradient, it was apparent that the DNA was a hybrid and contained both 14N and 15N. This eliminated the conservative model of replication, which would have resulted in two distinct bands: one matching the position of the 15N-containing DNA, and one matching the position expected for DNA containing only 14N. Only the dispersive and semi-conservative models fit the result.

The second generation of E. coli grown in the 14N medium contained two distinct bands. One of the bands was 14N DNA, and the other band was the intermediate (14N/15N) DNA. This result supported the theory of semiconservative replication since dispersive replication would have resulted in a single band of lower density DNA in each consecutive generation.

The figure below illustrates the theoretical outcome of the conservative, dispersive, and semiconservative models along with the experimental outcome obtained by Meselson and Stahl.


Figure: A schematic of the appearance of fractions of DNA samples after centrifuging in a density gradient, by CJHIGGIN
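The band patterns predicted by the three models can also be worked out with a short Python sketch. It makes simplifying assumptions that are not part of the original experiment's description: each duplex is represented only by the 15N fraction of its two strands, the dispersive model is assumed to split old material evenly among all four daughter strands, and the function names `replicate` and `bands` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

LIGHT = Fraction(0)  # strand made entirely of 14N
HEAVY = Fraction(1)  # strand made entirely of 15N


def replicate(duplex, model):
    """Return the two daughter duplexes of one duplex grown in 14N medium."""
    a, b = duplex
    if model == "conservative":
        # Parent duplex stays intact; the copy is entirely new (light).
        return [(a, b), (LIGHT, LIGHT)]
    if model == "semiconservative":
        # Each old strand pairs with one newly made light strand.
        return [(a, LIGHT), (b, LIGHT)]
    if model == "dispersive":
        # Old material is spread evenly over all four daughter strands.
        mixed = (a + b) / 4
        return [(mixed, mixed), (mixed, mixed)]
    raise ValueError("unknown model: " + model)


def bands(model, generations):
    """Count duplexes at each density (15N fraction) after n generations."""
    population = [(HEAVY, HEAVY)]  # cells start fully labelled with 15N
    for _ in range(generations):
        population = [d for duplex in population for d in replicate(duplex, model)]
    return Counter((a + b) / 2 for a, b in population)


for model in ("conservative", "semiconservative", "dispersive"):
    print(model, dict(bands(model, 1)), dict(bands(model, 2)))
```

In this sketch only the semiconservative model gives a single half-density band after one generation followed by equal half-density and light bands after two, which matches the outcome described above.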



Berg, Jeremy M., John L. Tymoczko, and Lubert Stryer. "Exploring Genes and Genomes." Biochemistry. New York: W. H. Freeman, 2007. 113-14. Print.

Campbell and Reese's Biology, 7th Edition

Nelson and Cox's Lehninger Principles of Biochemistry, 5th Edition

General Information


A knockout mouse is a mouse used by researchers for laboratory experiments aimed at understanding the consequences of inactivation or "knocking out" of a specific gene. In general, the over- and/or under-expression of genes in an organism for experimentation is known as transgenic technology [1]. This process is completed by disrupting or replacing the existing gene with an artificial piece of DNA that is a mutated version of the targeted gene [2]. Due to the disruption, there will be a loss of gene activity, and it will cause changes in the mouse's phenotype. When the mouse's phenotype is affected, the changes in appearance, behavior, and other physical characteristics should be evident in the offspring.

Purpose and Applications


As many genes are similar in mice and humans, the extraction or "knocking out" of a particular gene in a mouse can provide evidence to further understand the extent of the function of genes in humans [3]. This usually is manifested by a change in the animal's physical characteristics, behavioral characteristics, or biochemical pathways that regulate the mouse's functions [4]. This laboratory technique has been used in various types of research:

  • Cancer Research
  • Cystic Fibrosis
  • Lung, Heart, Blood, and Parkinson Diseases
  • Aging
  • Anxiety
  • Arthritis
  • Diabetes
  • Obesity
  • Neural Pathway Functions
  • Substance Abuse

A specific gene studied in a knockout mouse can also be useful for studying how different recreational drugs affect the animal, which can be used to test therapies for drug abuse in humans [5]. As another example, the p53 knockout mouse lacks p53, a gene that codes for a protein that inhibits the growth of tumors and stops cell division. With this gene removed, the mouse is at risk of developing various types of cancer (blood, lung, brain, bone, etc.). This is a useful model because humans with an abnormality in this gene have Li-Fraumeni Syndrome, a rare autosomal dominant hereditary disorder that puts people at a much higher risk of developing cancer.

Limitations and Weaknesses


Although knockout mouse technology is an excellent research tool, complications frequently occur when a particular gene is knocked out. For example, the mouse might depend on the gene of interest for other important bodily functions; if the gene is disrupted, the mouse might die or stop functioning correctly in unexpected ways. In addition, knocking out the gene may not produce an observable change in any of the mouse's characteristics. This makes it extremely difficult to relate the study to humans. Gene knockouts in mouse embryos may sometimes prevent the embryos from growing into adult mice. This limits the studies to the prenatal stage of the mouse, further weakening the link between the gene knockout in the mouse and the gene's role in humans.

Methods of Preparing Knockout Mice

Knockout Mouse Breeding Scheme

Knockout mice are created from embryonic stem cells (ES cells), which are harvested approximately 4 days after fertilization. ES cells are used at this early stage so that the swapped gene sequence is passed on to the rest of the cells during division and is carried by all the cells of the adult. The process is completed by one of two methods:

Gene Targeting


In gene targeting, a particular gene is manipulated within the nucleus of the mouse ES cells through homologous recombination [6]. To start, the DNA sequence of the gene to be replaced must be known. Next, a new DNA sequence is made that will be inserted into the chromosome in place of the wild-type allele. This artificial, inactive DNA sequence is nearly identical in sequence to the gene being knocked out, and it is flanked on both sides by stretches identical to the DNA that flanks the gene on the chromosome. The cell recognizes the identical stretches of DNA and "trades" the existing gene for the artificial DNA. Since the artificial DNA is inactive, the function of the existing gene has now been "knocked out" by gene targeting. The new cells keep growing and dividing with the new gene inside them.

Example: An embryo in the blastocyst stage is isolated from a strain of mice with gray fur. The embryonic stem cells are removed from the blastocyst and grown in tissue culture. The homologous recombinant gene is transferred into the cells, which are then grown in ganciclovir and neomycin to select the recombinants. The cells carrying the new gene are then transferred back into blastocysts. Many of the transformed blastocysts are implanted into a pseudo-pregnant mouse with white fur. The mouse gives birth to some white mice and some with patches of gray, showing the activity of the old gene. The mice with the patches, which carry both the gray fur and white fur genes, are mated with white mice. If the gametes of a gray-and-white mouse came from the recombinant stem cells, it will produce all gray offspring. All of the cells in these gray mice are heterozygous for the fur gene. The gray mice that are heterozygous for the recombinant gene are then mated with each other. The offspring are screened to identify which mice are homozygous for the recombinant gene. The end result is the knockout mouse, in which both of the alleles have been knocked out.[7]
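The final breeding step in the example above is an ordinary Mendelian cross, which can be sketched in Python as follows. The allele symbols ("+" for the wild-type allele, "-" for the knocked-out allele) and the function name `cross` are hypothetical, and the sketch assumes that each parent passes on either allele with equal probability and that all genotypes are equally viable.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Return expected genotype fractions among offspring of two parents.

    Each parent is a two-character string of alleles, e.g. "+-".
    """
    offspring = Counter(
        "".join(sorted(allele1 + allele2))
        for allele1, allele2 in product(parent1, parent2)
    )
    total = sum(offspring.values())
    return {genotype: Fraction(count, total) for genotype, count in offspring.items()}


# Two mice heterozygous for the knocked-out allele are mated.
ratios = cross("+-", "+-")
```

Under these assumptions, one quarter of the offspring of two heterozygous mice are expected to carry the knocked-out allele on both chromosomes, which is why the offspring must be screened to find the homozygous knockout mice.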

Gene Trapping


Gene trapping is done by using a sequence of artificial DNA which holds a "reporter gene" that is made to insert into any gene at random. The artificial DNA prevents RNA splicing in the cell, thus preventing the existing gene from synthesizing its assigned protein and eliminating its function. Now the activity of the artificial "reporter gene" can be observed and studied, to determine the existing gene's normal function in the mouse.

Which method is better?


For both of these methods, a DNA vector is used to carry the artificial DNA into the embryonic stem cells of the mice. Once the DNA has been introduced, the cells are cultured in vitro and then injected into mouse embryos. These embryos are implanted into female mice, which then give birth to mice carrying the knocked-out genes.

Both methods have their own advantages. In gene targeting, the sequence of the target gene is known, which allows researchers to knock out the exact sequence(s) that they find interesting. In gene trapping, on the other hand, the gene that will be knocked out is not known in advance. Because the "reporter gene" inserts at random, the method can create many different kinds of mice and may knock out a gene that would not easily have been chosen deliberately. However, the randomness makes finding the function of a specific gene cumbersome, and researchers need to spend a lot of time testing the ES cells to identify which gene has been knocked out.


  1. Wikipedia: Knockout Mouse
  2. National Human Genome Research Institute: Knockout Mice


  1. "What is transgenic technology". Knockout Mouse and Transgenic Research. Retrieved 2009-11-14.
  2. Twyman, Richard. "Knockout Mice". The Human Genome. Retrieved 2009-11-14.
  3. "NIH Knockout Mouse Project". National Institute of Health. Retrieved 2009-11-14.
  4. "Knockout Mice". National Human Genome Research Institute. Retrieved 2009-11-14.
  5. Berg, Jeremy (2006). Biochemistry (6th ed.). W. H. Freeman. ISBN 0716787245.
  6. Twyman, Richard. "Knockout Mice". The Human Genome. Retrieved 2009-11-14.
  7. Campbell, A. Malcom. "Homologous Recombination and Knockout Mouse". Davidson College. Retrieved 2009-11-18.

Transgenic animals are animals that have had foreign genes from another animal introduced into their genome. A foreign gene (such as one for a hormone or blood protein) is cloned and injected into the nucleus of another animal’s in vitro fertilized egg. Some cells integrate the transgene and express the foreign gene, after which the developing embryo is surgically implanted in a surrogate mother. The result of this process, if the embryo develops, is a transgenic animal carrying a particular gene from another species.

Applications of transgenic technology include improving livestock, for example producing higher quality wool in sheep or increasing the muscle mass of an animal so that it yields more meat for consumption. Transgenic animals can also be used for medical purposes, such as producing human proteins: a desired transgene is inserted into the genome of an animal in a manner that causes the target protein to be expressed in the milk of the transgenic animal.

Another example involves mice. Normal mice cannot be infected by the polio virus, because they lack the cell surface molecule that serves as the receptor for polio; humans do have this receptor. However, the polio receptor gene can be injected into a mouse to produce a transgenic mouse. This mouse can then be infected by the polio virus and displays symptoms similar to those seen in humans affected by the virus.

Among the most common current studies with transgenic animals are those involving animals such as rhesus macaques that carry the human gene for Huntington’s disease. These allow scientists to search for a cure for Huntington’s, or at least a better treatment option. Other animals, such as mice or animals that contain human stem cells, are used to develop medicines and treatment options for diabetes, strokes, and blindness.

The Human Genome Project has also been of great help to work with transgenic animals. Now that the DNA sequence of the human genome is known, scientists can study genes as potential drug targets, which helps them identify the gene most relevant to treating the disease they are studying.

The expression of a transgene can also be engineered to take place in plants, for example by taking the bioluminescence gene that makes fireflies glow in the dark and introducing it into a plant.

Transgenic Animals: Countless Benefits to Humanity


The three most common reasons for producing transgenic animals for the benefit of human welfare are agricultural, medical, and industrial.

Agricultural Applications


Farmers have always wanted the best breed of each type of animal, with the best traits it can possibly have. Traditional breeding can take a great deal of time and is not very efficient. With new advances in technology, selected characteristics can be developed in a species in much less time and with greater accuracy.

Not only are animals produced more efficiently, but their quality is enhanced as well. Examples include cows that produce milk with less lactose or cholesterol and sheep that grow more wool.

Animals with these new qualities must also be protected. Scientists are working on creating animals that are resistant to particular diseases, which supports both of the goals described above.

Medical Applications


Animals whose genes have been modified so that they show disease symptoms can be studied, and cures may be devised as a result. At Harvard, scientists created a transgenic mouse known as OncoMouse®, or the Harvard mouse, which carries genes that promote the development of a variety of cancers found in humans.

Xenotransplantation is expected to play a major role in medicine in the future. It is the transplantation of living cells, tissues, or organs from one species to another. Because of worldwide shortages of organs, researchers are using gene manipulation to alter animal organs so that they become compatible with humans. For example, transgenic pigs may play a major role in organ transplants for humans. Because pig and human organs are similar, it may be possible to use pig organs for transplants. At present, however, a pig protein causes the human immune system to reject the organs. If that protein can be successfully replaced by a human protein, pigs could be used to meet a major need: transplant organs such as hearts, livers, and kidneys.

Transgenic animals can also be used to produce refined pharmaceutical drugs and nutritional supplements. For example, insulin and blood anti-clotting factors may soon be extracted from the milk of transgenic animals such as goats, sheep, and cows. Milk is an important focus of research, which aims to create milk that can be used to treat diseases such as phenylketonuria or cystic fibrosis.

Human Gene Therapy, [33]

Human gene therapy is another medical application that is gaining wide acclaim. In essence, it is the transfer of genetic information into a patient's tissues and organs. As a result, defective copies of genes can be eliminated or their normal functions restored. The procedure can also give cells new functions. For example, to combat cancer and other diseases, a gene that causes the production of immune system mediator proteins can be inserted. Through this therapy, many genetic diseases could have potential cures further down the road.

There are two paths to gene therapy. The first is the direct transfer of genes into the patient. The second is the use of living cells as vehicles to transport the genes of interest. Each path has advantages and disadvantages.

Direct gene transfer is the simplest way of administering the gene of choice, and there are two methods. In the first, genes are delivered via liposomes or other biological microparticles into the patient's tissue or bloodstream. In the second, genes are introduced using genetically engineered viruses, such as retroviruses or adenoviruses. Because of biological safety concerns, the viruses must first be altered so that they are not infectious. The simplicity of direct gene transfer comes with major weaknesses. For example, it does not allow control over where the therapeutic gene will insert: the transferred gene will either insert itself randomly into the patient's chromosomes or remain unintegrated in the targeted tissue. Moreover, the target tissue may not be easily accessible for direct application of the therapeutic gene.

The second path, the use of living cells to deliver the therapeutic gene, is much more complex than direct gene transfer. It has three major steps. First, cells from the patient are isolated and propagated in vitro. Second, the therapeutic gene is introduced into these cells using methods similar to those of direct gene transfer. Finally, the genetically modified cells are returned to the patient.

The advantage of using cells as gene transfer vehicles is that cells can be manipulated more accurately and precisely in the laboratory than in the body. In addition, some of these cells can propagate continually under laboratory conditions before reintroduction into the patient. Some cell types can also localize to particular regions of the human body; for example, hematopoietic (blood-forming) stem cells return to the bone marrow upon reintroduction into the body. This behavior can be very useful for delivering a therapeutic gene that must act in a specific region.

A major disadvantage, however, is the biological complexity of the living cell's environment. Isolating a specific type of cell requires not only extensive knowledge of its biological markers, but also knowledge of what that cell type needs to stay alive in vitro and continue to divide. Unfortunately, the specific biological markers of many cell types are unknown. Moreover, many normal human cells cannot be maintained in the lab for long periods without accumulating deleterious mutations.

Industrial Applications


Transgenic animals have been produced for chemical safety testing, because these animals are sensitive to toxic substances. Transgenic animals may also produce substances that can be used in biochemical reactions. Microorganisms have likewise been engineered to produce enzymes that speed up important reactions.

Production of Transgenic Animals


Producing a transgenic animal means introducing foreign genes into the genome, the genetic makeup of the organism. The inserted genes are known as transgenes. Most importantly, the foreign genes must be transmitted through the germ line, so that every cell, including the germ cells, whose function is to transmit genes to the organism's offspring, contains the same change in genetic material. The predominant method of creating transgenic animals is DNA microinjection. However, the technique is inefficient: DNA insertion is random and the success rate is very low. The offspring are the animals studied for the new transgene, and obtaining offspring that successfully carry it is extremely difficult.

Scientists may produce transgenic animals in three main ways: DNA microinjection, retrovirus-mediated gene transfer, and embryonic stem cell-mediated gene transfer.

Transfer of desired gene into mouse via microinjection, [34]

DNA microinjection


Technique summary


The mouse was the first animal on which DNA microinjection was used. DNA microinjection is the transfer of a desired gene into the pronucleus of a reproductive cell. The cell is first cultured in vitro. Once it reaches a specific stage of embryonic development, it is transferred into the recipient female.

Technical Explanation

Retrovirus mediated Gene transfer, [35]

The pipets for this technique must be made specially from extremely thin glass, using a pipet puller and a microforge. The tip must be absolutely flat, or it will meet resistance when injected into the embryo. The DNA injection pipet should have an internal diameter of about 1 µm or less. Gloves coated with talc should be avoided, because the powder can clog the pipets and can lead to the loss of the embryos. The embryo being worked on is first viewed at very low magnification. Using gentle suction, the embryo is drawn onto the end of a pipet and held there. The tip of the injection pipet is then brought up to the pronucleus and pushed through the cell membrane and the cytoplasm. It is often hard to see whether the tip has passed through the pronuclear membrane. The only reliable sign of a successful injection is that the pronucleus swells, by a volume of around 1 pl. After injection, the embryo is moved to the far side of the dish so that the next one can be injected. When a batch of embryos is complete, it is incubated and then evaluated after a period of time. Viable embryos are then transferred to the oviduct of a recipient female.

Retrovirus-mediated gene transfer


Retroviruses are used as vectors to transfer genetic material in the form of RNA rather than DNA. The genetic material is transferred into the host cell, resulting in a chimera, an organism made up of cells that do not all carry the same genetic material. Chimeras are inbred for as many as twenty generations until homozygous offspring are obtained that carry two copies of the transgene in all of their cells. The approach was shown to work in 1974, when a virus was used as a vector to deliver genes into mouse embryos, and the resulting mice carried the desired transgene.
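As a rough illustration of why further crosses are needed before homozygous animals appear, the sketch below computes the expected offspring genotypes of a single cross. It assumes, purely for illustration, that the transgene sits at a single insertion site and segregates as a simple Mendelian locus; real insertions may be multi-copy or present in only some cells. The function name `cross` and the genotype labels ("T" for transgene present, "-" for absent) are hypothetical and do not come from the sources cited in this chapter.

```python
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected offspring genotype fractions for one locus.

    Each parent is a two-character string, one character per allele:
    'T' = transgene present, '-' = transgene absent (illustrative labels).
    """
    fractions = {}
    # Each parent passes on one of its two alleles with equal probability,
    # so each of the four allele pairings has probability 1/4.
    for allele_a, allele_b in product(parent_a, parent_b):
        genotype = "".join(sorted((allele_a, allele_b), reverse=True))
        fractions[genotype] = fractions.get(genotype, Fraction(0)) + Fraction(1, 4)
    return fractions


# Two animals that each carry a single copy of the transgene.
offspring = cross("T-", "T-")
for genotype, fraction in offspring.items():
    print(genotype, fraction)
```

Under these assumptions only a quarter of the offspring of two single-copy carriers are homozygous ("TT"), whereas crossing two homozygous animals gives only homozygous offspring, which is the state the breeding scheme aims for.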

Embryonic stem cell mediated gene transfer, [36]

Embryonic stem cell mediated gene transfer


The technique involves isolating totipotent stem cells (stem cells that can develop into any type of specialized cell) from embryos. The desired gene is inserted into the stem cells. The stem cells containing the DNA of interest are then incorporated into the host's embryo, resulting in a chimeric animal. A major benefit of this technique is that the transgene can be tested for at the cellular level, which saves considerable time because one does not have to wait for live offspring.

Stem Cells (in Further Detail)


What exactly are stem cells?

Stem cells are now a hot topic for research because of their seemingly endless potential. They are cells that can develop into many different types of cells in the body during early life and during growth. Stem cells also serve as an internal repair system, dividing essentially without limit to replenish damaged cells for the duration of the organism's life span. When a stem cell divides, each new cell can either remain a stem cell or become a more specialized cell, such as a liver cell, a white blood cell, or a brain cell.

How are stem cells set apart from other types of cell?
Two primary properties distinguish them. First, stem cells are unspecialized cells that can renew themselves through cell division, sometimes after prolonged periods of inactivity. Second, under the right physiological conditions they can be induced to become tissue-specific or organ-specific cells with distinctive functions. In some organs, such as the gut and the bone marrow, stem cells divide regularly to replace injured or worn-out cells.

In the past, researchers mainly worked with two categories of stem cells, from both humans and animals: embryonic stem cells and somatic stem cells, which are also called adult stem cells. The first embryonic stem cells were derived from mouse embryos around 1981. The detailed study of mouse embryos later made it possible to derive human embryonic stem cells; the human embryos used had been created for reproductive purposes. More recently, a third category has been identified, known as induced pluripotent stem cells (iPSCs). These cells are unique because they are adult cells that have been genetically reprogrammed to behave like stem cells.

Why are stem cells valuable for living organisms?
In blastocysts, which are embryos only 3–5 days old, the inner cells give rise to the entire body of the organism, including all of its specialized cells and organs, such as the skin, heart, lungs, reproductive cells (sperm and egg), and other tissues. In adult tissues, including bone marrow and muscle, stem cells can replace cells that are damaged, affected by disease, or simply worn out.

Stem cell research continues to add new insights into how organisms develop from cells and how damaged cells are repaired. Stem cells may also be used to screen new drugs before they are brought to market, and to better understand not only cell development but also the irregularities that cause birth defects.

Special characteristics of stem cells
Stem cells are set apart from the other cells of the body. All types of stem cells share three defining characteristics: they can divide and renew themselves for long periods of time, they are unspecialized, and they can give rise to many different cell types. Each of these properties is explained in more depth below.

The first property is the ability of stem cells to divide and renew themselves for long periods of time. Muscle and nerve cells typically do not replicate themselves, but stem cells can do so many times over. A population of stem cells that replicates in the laboratory for many months can yield millions of cells. If the resulting cells remain unspecialized, like the parent cells, the cells are said to be capable of long-term self-renewal. Two questions about long-term self-renewal are of particular interest: why embryonic stem cells can replicate for a year or more in the laboratory without differentiating while most non-embryonic stem cells cannot, and which factors in living organisms regulate stem cell replication and self-renewal.
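The arithmetic behind a culture reaching millions of cells can be sketched with simple doubling. The function below is a hypothetical illustration, not a model taken from the sources: it assumes that every cell divides in each round and that no cells die or differentiate, which real cultures do not satisfy.

```python
def doublings_needed(start_cells, target_cells):
    """Number of population doublings needed to reach at least target_cells.

    Idealized assumption: every cell divides once per round, and no cell
    dies or differentiates.
    """
    if start_cells <= 0 or target_cells <= 0:
        raise ValueError("cell counts must be positive")
    doublings = 0
    cells = start_cells
    while cells < target_cells:
        cells *= 2  # one round of division doubles the population
        doublings += 1
    return doublings


# Starting from a single cell, how many doublings give a million cells?
print(doublings_needed(1, 1_000_000))  # prints 20
```

Because growth is exponential under these assumptions, about twenty rounds of division are enough to take one cell past a million, which is why even a small starting population can yield very large numbers of cells over months of culture.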

Discovering how stem cell replication is regulated during normal development may help explain how irregular cell division leads to cancer. It could also make it possible to grow embryonic and non-embryonic stem cells more efficiently in the laboratory. Stem cells remain unspecialized only under special conditions, which are set up by signals in the cells that cause them to replicate and stay unspecialized.

Stem cells are unspecialized. Because they are unspecialized, they cannot perform the specialized tasks carried out in specific tissues or organs. For example, a stem cell cannot work with other cells to carry oxygen molecules through the body, as red blood cells do. What is unique about stem cells is their potential to give rise to specialized cells such as nerve cells, brain cells, or muscle cells.

Stem cells can give rise to specialized cells. The process by which unspecialized stem cells become specialized cells is known as differentiation. Differentiation has multiple steps, and the cell becomes more specialized at each step. Many factors help to control this progression. Signals both inside and outside the cell drive stem cells through each stage. Outside signals include physical contact with neighboring cells, chemicals secreted by other cells, and the presence of specific molecules in the immediate environment. Inside signals are controlled by the cell's genes, which carry the instructions for its structures and functions. Understanding how stem cells are regulated can help researchers grow cells or even tissues for drug screening and cell therapy, which is what makes stem cells such an important subject of research.

Different types of stem cells

Embryonic stem cells
These cells come from embryos. Most of them come from eggs that were fertilized in vitro, that is, in the laboratory, and then donated to labs for research. The embryos from which human stem cells are derived are usually about 4–5 days old and are at the blastocyst stage, which is essentially a hollow ball of cells. The blastocyst has three structures: the trophoblast, the embryoblast (or pluriblast), and the blastocoel. The trophoblast is the layer of cells that surrounds the blastocoel. The blastocoel is the hollow cavity of the blastocyst, and the embryoblast is the mass of cells that will develop into the defined structures of the fetus.

How are embryonic stem cells identified?

While a line of embryonic stem cells is being generated, the cells are tested at various points to check that they have the properties that allow them to be called embryonic stem cells. This process is known as characterization. There is no universally agreed standard set of tests, but several tests can be used. The first is to grow the stem cells for a number of months, which shows that the cells are capable of long-term growth and self-renewal. The cells are inspected under a microscope to confirm that they are in good condition and have not differentiated. A second test is to look for transcription factors that are characteristic of undifferentiated cells, specifically Nanog and Oct4. Transcription factors help turn genes on or off when needed, which is integral to cell differentiation and embryonic development. Nanog and Oct4 help keep stem cells undifferentiated. A third test is to use specific techniques to look for the cell surface markers that undifferentiated stem cells display. A fourth test is to examine the chromosomes under a microscope to determine whether they are damaged or whether their number has changed. A fifth test is to see whether the cells can be grown again after being frozen and thawed.

The sixth and last test is to determine whether the human embryonic stem cells are pluripotent. This may be done by allowing the cells to differentiate spontaneously in the laboratory, by directing the cells to differentiate into cells of all three germ layers, or by injecting the cells into a mouse with an impaired immune system to test for the development of a teratoma, a benign tumor. Because the immune system of the mouse does not reject the injected stem cells, their growth and differentiation can be observed. The tumor contains a mixture of differentiated and partly differentiated cell types, showing that embryonic stem cells can differentiate into many different types of cells.

How does differentiation of embryonic stem cells occur?
When embryonic stem cells are kept under the right conditions, they remain unspecialized. When the cells are allowed to aggregate and form what are called embryoid bodies, they differentiate spontaneously and can form many different types of cells. Spontaneous differentiation shows that a sample of embryonic stem cells is in good condition, but it is not an efficient way to produce specific cell types.

Mouse embryonic cells that are directed in differentiation, photo by Terese Winslow, [37]

Cultures of specific types of differentiated cells, such as blood cells or brain cells, are generated by controlling the differentiation of embryonic stem cells. Researchers can change the chemical composition of the culture medium, alter the surface of the culture dish, or modify the cells themselves by inserting specific genes. After years of trial and error, some standard protocols have been established for the directed differentiation of embryonic stem cells into certain cell types. If directed differentiation is successful, the resulting cells may be used to treat diseases including Parkinson's disease, Duchenne's muscular dystrophy, heart disease, vision loss, and traumatic spinal cord injury.

Adult stem cells
Adult stem cells are thought to be undifferentiated cells, found among the differentiated cells of a tissue or organ, that can renew themselves and can differentiate to yield some or all of the major specialized cell types of that tissue or organ. The main job of adult stem cells in an organism is to maintain and repair the tissues in which they are located. Unlike embryonic stem cells, which are defined by their origin, the origin of some adult stem cells in some mature tissues is still being researched.

As more research is conducted on adult stem cells, they are being found in many more tissues than was once thought possible. This has opened up the possibility of using adult stem cells for transplants. Blood-forming hematopoietic stem cells from bone marrow are already widely used in transplants. It is now evident that stem cells also exist in the heart and the brain. If the differentiation of these stem cells can be controlled correctly, it may be feasible to use them in transplantation-based therapies.

Adult stem cells were first discovered in bone marrow, which contains two kinds: hematopoietic stem cells and bone marrow stromal stem cells, which were discovered second. The stromal cells are few in number but can generate fat, bone, cartilage, and fibrous connective tissue.

Location of adult stem cells and their role?
Adult stem cells are found in many different organs and tissues, including the bone marrow, brain, blood vessels, skin, teeth, heart, liver, ovarian epithelium, and testis. Within each tissue, stem cells live in a particular area. In many tissues, some stem cells are pericytes, the cells that make up the outer layer of small blood vessels. Stem cells usually do not divide for long periods of time until they are prompted to by the need for normal tissue maintenance, by injury, or by disease.

The number of stem cells in each tissue is normally small, and once they are removed from the body their ability to divide is limited, which makes it difficult to produce large quantities of them. As a result, researchers are looking for better ways to grow large quantities of adult stem cells in the laboratory so that specific cell types can be created to treat diseases and injuries. Possible uses include regenerating bone from cells of the bone marrow stroma, making insulin-producing cells to treat type 1 diabetes, and repairing heart muscle that was severely damaged by a heart attack.

Identification of adult stem cells

Researchers typically use several methods to identify adult stem cells. One is to tag the cells in living tissue with molecular markers and then determine which specialized cell types they produce. Another is to remove the cells from a living organism, tag them in the laboratory, and transplant them into another organism to observe whether they repopulate their tissue of origin.

Above all, it must be shown that a single adult stem cell can produce an entire colony of genetically identical cells that in turn give rise to all the appropriate differentiated cell types of that tissue. To confirm experimentally that cells are indeed adult stem cells, researchers show that the cells can produce genetically identical cells, that they can rebuild the tissue after being transplanted into another animal, or both.

Adult cell differentiation

Differentiation of Both Hematopoietic and stromal stem cells, photo by Terese Winslow, [38]

Normal differentiation
Adult stem cells divide when called upon and can produce mature cells that have the shape, structure, and function characteristic of the tissue in which they reside. Some examples follow. Hematopoietic stem cells produce every type of blood cell, including B lymphocytes, T lymphocytes, natural killer cells, basophils, monocytes, and red blood cells. Mesenchymal stem cells produce a variety of cell types, including bone cells, fat cells, and cartilage cells. Neural stem cells in the brain produce neurons, astrocytes, and oligodendrocytes. Epithelial stem cells reside in the lining of the digestive tract and produce goblet cells, enteroendocrine cells, and absorptive cells, among others. The stem cells of the skin reside in the basal layer of the epidermis and produce keratinocytes, which form a protective layer.

Some adult stem cells may differentiate into cell types of organs or tissues other than the ones expected, such as blood-forming stem cells differentiating into heart muscle cells. This type of differentiation is known as transdifferentiation. Whether it occurs in human beings has not been fully proven. One possible explanation for the observations is fusion of the donor cell with a recipient cell. Another is that the injected stem cells secrete factors that prompt the recipient's own stem cells to begin repair. Where transdifferentiation has been observed, it has involved only a small number of cells.

Scientists have shown that some adult cells can be reprogrammed into different cell types in the laboratory using precise genetic modifications. This may offer a way to replace cells that have been injured or destroyed by disease. In diabetes, for example, the insulin-producing beta cells of the pancreas can be recreated by reprogramming other pancreatic cells. The reprogrammed cells were very close in appearance and shape to true pancreatic beta cells. When placed in mice whose own pancreatic beta cells did not work, the reprogrammed cells improved the regulation of blood sugar levels.

Adult somatic cells can be reprogrammed to behave like embryonic stem cells by introducing embryonic genes; the resulting cells are known as induced pluripotent stem cells (iPSCs). With iPSCs, cells can be introduced that match the recipient and will not be rejected, which is important when creating new tissue. However, iPSCs are still under study, because they cannot yet be made to differentiate reliably into only the intended cell type.

Similarity among stem cells
Human embryonic and adult stem cells have both similarities and differences with respect to their use in regenerative therapy, that is, in repairing damaged tissue and cells. A primary difference is the number and kind of differentiated cell types each can become. Embryonic stem cells can become every cell type in the body because they are pluripotent. Adult stem cells are thought to be limited to differentiating into the cell types of their tissue of origin.

Another noteworthy difference is that embryonic stem cells can be grown with relative ease in the laboratory. Adult stem cells are rare in mature tissues, so finding them can be difficult, and, unlike for embryonic stem cells, methods to grow them in large numbers in the laboratory have not yet been worked out. This difference matters because cell replacement therapies often require large numbers of cells to work properly.

Moreover, tissues derived from embryonic and adult stem cells may differ in how likely they are to be rejected after injection or transplantation. Embryonic stem cells have not yet been studied extensively in this respect, as clinical testing of cells derived from hESCs has only recently been approved by the FDA (Food and Drug Administration). Adult stem cells, and the tissues derived from them, are presumed to be less likely to be rejected after transplantation. This is because a patient's own cells can be expanded in the laboratory, induced to differentiate into a specific cell type, and then re-injected into the same patient. Using tissue derived from the patient's own adult stem cells greatly decreases the probability of rejection by the immune system. This is a major benefit, since rejection can otherwise be prevented only with immunosuppressive drugs, which have side effects of their own.

Uses for stem cells

Using adult stem cells to repair heart muscle cells, photo by Terese Winslow, [39]

There are many uses for stem cells, both in research and in the clinic. Studying human embryonic stem cells will provide information about human development. The principal aim is to pinpoint how undifferentiated stem cells become differentiated cells and later form organs and tissues. Gene regulation is central to this process. Many of the most serious medical conditions in humans result from abnormal cell division and differentiation. Recent research with iPS cells suggests that specific factors involved in genetic and molecular signaling play a role, but introducing these factors into cells in a controlled way to direct these processes will require special techniques.

Human stem cells may be used to screen new drugs. Drugs can be tested for safety on differentiated cells derived from stem cells. A vivid example is the use of cancer cells to screen potential anti-tumor drugs. To compare drugs properly, the test conditions must be very similar, which requires precise control over which cell types the stem cells differentiate into.

Another widespread use of stem cells is to create cells and tissues to repair damaged or diseased tissue in cell therapy. Such regenerated cells and tissues can help treat conditions such as Alzheimer's disease, stroke, heart disease, osteoarthritis, and spinal cord injury.

Checklist for successful transplant of stem cells
To be useful for transplantation, stem cells must:
1) Proliferate extensively and produce sufficient quantities of tissue
2) Differentiate into the desired cell type
3) Survive in the recipient after transplantation
4) Integrate into the surrounding tissue after transplantation
5) Function correctly for the rest of the recipient's life
6) Have no detrimental effects on the recipient

Ethical conflicts with stem cells?
The main concern with stem cells involves human embryonic stem cells, which have generated a great deal of public interest and conflict. Pluripotent stem cells, which can become many different cell types in the human body, are derived from human embryos that are a few days old. The major debate is over when life technically begins, whether embryos or even fetuses should be considered living, and who has the authority to decide such an issue.

United States' position on stem cells
In 2001 the Bush administration offered federal funds for research on human embryonic stem cells provided that three criteria were met. On March 9, 2009, however, President Barack Obama issued Executive Order 13505, Removing Barriers to Responsible Scientific Research Involving Human Stem Cells. This order allowed the National Institutes of Health (NIH) to take a different approach to human stem cell research. It also essentially nullified both Executive Order 13435 and the presidential statement of August 9, 2001.



Arlan Richardson. Photo. "Use of Transgenic mice in Aging Research.” 1997 <>

Gordon, Jon W. Photo. "Transgenic Technology and Laboratory Animal Science." 1997. <>

"2009 Executive Order Disposition Tables: Removing Barriers to Responsible Scientific Research Involving Human Stem Cells." <>. 11 March 2009. 2 December 2009.

Margawati, Endang Tri. "Transgenic Animals: Their Benefits To Human Welfare." ActionBioscience. Jan 2003. 15 Nov 2009 <>

"Stem Cell Basics." In Stem Cell Information. Bethesda, MD: National Institutes of Health, U.S. Department of Health and Human Services, 2009. <>. 3 December 2009.

"Transgenic Animals and Genetic Research."<>. 16 Nov 2009.

"What are Some Issues in Stem Cell Research."<>. 9 November 2009. 3 December 2009.

Winslow, Terese. Photo. 2001. 2 Dec. 2009. <>

Zwaka Thomas P. “Use of Genetically Modified Stem Cells in Experimental Gene Therapies.” <>

Zwaka Thomas P. Photo. “Use of Genetically Modified Stem Cells in Experimental Gene Therapies.” <>

Transgenic plants are genetically engineered to carry genes from other organisms in their genome. They are a class of genetically modified organisms (GMOs). The introduced genes do not have to come from the plant kingdom; they can come from animals, viruses, or bacteria as well. Uses of exogenous gene introduction include virus immunity, a replacement for pesticides, the ability to grow in acidic soil, and greater nutritional content.

Making Transgenic Plants

Breeding transgenesis cisgenesis

Transgenic plants are constructed by inserting genes from other organisms into the host plant's DNA. For this to happen, the desired gene must first be isolated and cloned. A few changes must then be made to the gene so that it can be inserted and expressed effectively in the plant. First, a promoter sequence must be added. The promoter is an on/off switch that controls where and under what cues the gene is expressed. The coding sequence itself must sometimes be modified as well. For example, the Bt gene for insect resistance has a higher proportion of A-T nucleotide pairs than plant genes, which tend to have more G-C pairs. A-T pairs can be replaced with G-C pairs in a way that does not significantly change the amino acid sequence, leading to better expression of the inserted gene in plant cells. A terminator sequence must also be added to signal that the end of the gene has been reached. Finally, a selectable marker gene must be inserted to identify the plant cells that have successfully integrated the transgene.
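The idea of changing a gene's base composition without changing the protein it encodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the codon table is a small subset of the standard genetic code, the input sequence is made up, and the rule "pick the synonymous codon with the most G and C" is a deliberate simplification of what real codon optimization does.

```python
# Subset of the standard genetic code: amino acid -> synonymous codons.
# Only the amino acids used in the made-up example below are listed.
SYNONYMOUS_CODONS = {
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "K": ["AAA", "AAG"],
    "N": ["AAT", "AAC"],
    "I": ["ATT", "ATC", "ATA"],
    "F": ["TTT", "TTC"],
    "M": ["ATG"],
}
CODON_TO_AA = {
    codon: aa for aa, codons in SYNONYMOUS_CODONS.items() for codon in codons
}


def gc_count(seq):
    """Number of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq)


def translate(dna):
    """Translate a coding sequence codon by codon (no stop codons handled)."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def raise_gc(dna):
    """Replace each codon with the synonymous codon richest in G and C."""
    out = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TO_AA[dna[i:i + 3]]
        out.append(max(SYNONYMOUS_CODONS[amino_acid], key=gc_count))
    return "".join(out)


at_rich = "ATGTTAAAAAATATATTT"   # hypothetical A-T rich fragment (M L K N I F)
optimized = raise_gc(at_rich)
print(translate(at_rich), translate(optimized))   # same protein both times
print(gc_count(at_rich), gc_count(optimized))     # G+C content goes up
```

The protein sequence is unchanged because every substitution stays within one amino acid's set of synonymous codons.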

Agrobacterium System


Two methods commonly used to transform plants are the Agrobacterium method and the "Gene Gun" method. The Agrobacterium method uses Agrobacterium tumefaciens, a soil-dwelling bacterium that infects plant cells by transferring a segment of DNA, the transfer DNA (T-DNA), from its tumor-inducing (Ti) plasmid into the host's nuclear DNA. (A plasmid is a DNA molecule, often circular, that can replicate independently of chromosomal DNA.) The bacterium belongs to the family Rhizobiaceae and is responsible for many tumors found in plants. The Ti plasmid contains the T-DNA as well as a series of vir (virulence) genes that direct the infection process.

Agrobacterium tumefaciens can be used as a vector for gene transfer into plants. First, a hybrid plasmid that carries only the T-DNA from a Ti plasmid is cut open with a restriction enzyme and a foreign gene is inserted, creating a recombinant plasmid. The recombinant plasmid is then transferred into an Agrobacterium tumefaciens cell containing a Ti plasmid from which the T-DNA has been removed. The Agrobacterium carrying the engineered plasmid is then used to infect a plant, and the T-DNA with the foreign gene is integrated into the plant genome.

For transformation to work, the DNA must be able to penetrate the plant cells. This is often achieved by electroporation, in which brief high-voltage electrical pulses are applied to naked protoplasts (plant cells whose walls have been removed) in the presence of the DNA. The pulses open pores in the plasma membrane, allowing the DNA to enter the protoplast, which can then be grown into a mature plant by treating it with hormones.

In the "Gene Gun" method, gold or tungsten microspheres (about 1 micrometer in diameter) are coated with DNA or RNA carrying the gene of interest. The microspheres are then accelerated into undifferentiated target cells in a petri dish. Once inside the cells, the DNA coating the microsphere is released and can be incorporated into the host plant genome.

The advantage of the Agrobacterium method is that a high percentage of the transformed plants carry a single copy of the T-DNA. In addition, an abundance of vector systems is available for carrying out this method.

Biolistic Method


This method delivers DNA-coated microprojectiles by accelerating them into the cells of interest. The microprojectiles are usually made of tungsten or gold. The acceleration is produced by a gunpowder charge or by helium under high pressure. Plants made using the biolistic method often carry multiple copies of the introduced gene, which can still segregate in a Mendelian pattern. This method helps increase the diversity seen in plants. The biolistic method has some advantages over the Agrobacterium method. Plants that have undergone gene bombardment remain fertile. In addition, it is the only reliable method for transforming the chloroplast, and it does not require a transformation vector.

Importance of Transgenic Plants


The new methods developed to transform plants have opened a new field of interest. Transgenic plants are used to address many problems in agriculture. In addition, transgenic plants can be used in the medical field.

Nutrients of Transgenic Plants


When people go to the supermarket, they usually avoid fruit that is soft or overripe. A major problem in agriculture is that fruit often becomes soft during processing and transport because it continues to ripen. Using the methods for creating transgenic plants, scientists are able to slow the ripening process. Three companies have applied this technology to slow the ripening of tomatoes, and other companies hope to do the same for other fruits such as mangoes and papayas.

Cereal grains and legume seeds are a major source of protein for many people. However, they often lack certain amino acids, such as lysine in cereal grains and methionine in legume seeds. Much effort has gone into creating seeds with higher nutritional value. Transgenic tobacco and canola seeds currently show a 33% increase in methionine. In addition, the nutritional value of potatoes has been increased by transforming them with AmA1, a gene from amaranth.

Increasing the nutritional value of plants and fruits can address many problems of malnutrition and disease. Vitamin A deficiency is a major problem in Asia, where it affects around 124 million children and causes blindness. The main staple food in Asia is rice, but rice does not contain any vitamin A. Research is under way to develop rice that is rich in vitamin A. Scientists have identified genes encoding the enzymes that synthesize β-carotene (pro-vitamin A) in the endosperm of transgenic rice seed, and they hope to use this information to engineer rice that supplies vitamin A.

Uses of Transgenic Crops


The use of transgenic plants for pathogen resistance has received the most attention from popular media. The use of GMOs has been a topic of debate since their introduction in the mid-1990s. The two best-known cases are virus immunity in papayas and insect resistance in crops such as corn through a gene from Bacillus thuringiensis (Bt). The papaya ringspot virus (PRSV), which severely damages papaya trees, was taking a major toll on the papaya industry in Hawaii. Genes for the coat protein of the virus were inserted into papaya tissue using the gene gun. Some of the papaya cells incorporated the viral genes into their DNA, giving the plant immunity to PRSV. This saved the Hawaiian papaya industry.

The introduction and use of Bt crops is even more publicized. The Bt gene codes for the Cry proteins, toxins that specifically target and kill the larvae of butterflies and moths. Once the gene was introduced, crops such as corn, rice, and potatoes were able to produce the Cry proteins, which have proved very effective at stopping insect pests such as the European corn borer caterpillar. The protein is very selective and does not harm other insects (e.g. beetles, flies, bees, wasps), and it is also considered safe for human consumption. The use of the Bt endotoxin has greatly reduced the use of pesticides on crops. However, pests can evolve resistance to Bt corn, so refuge crops that do not contain the toxin are planted to slow the evolution of caterpillar resistance to the Cry proteins.

GMOs have also been engineered to improve the nutritional quality of food, to extend shelf life by delaying senescence, to allow corn to grow in acidic soil, to protect strawberries from cold temperatures, and for a variety of other uses.



Bessin, Ric. Bt-Corn: "What It Is And How It Works". University of Kentucky College of Agriculture. January 2004.

Transgenic Crops: An Introduction and Resource Guide. Colorado State University Soil and Crop Sciences. March 2006.

Lipps G (editor). (2008). Plasmids: Current Research and Future Trends. Caister Academic Press.

Raven, Peter. "Biology of Plants". W.H. Freeman and Company. New York. 2005.

"Harvest of Fear" (Film) - Nova. PBS. 2004

Peña, Leandro. Transgenic Plants: Methods and Protocols. Totowa, NJ: Humana, 2005. Print.

Because DNA contains all of the hereditary information and the instructions for protein production, it is crucial that there be very few changes to the DNA. DNA is constantly bombarded by radiation and chemical mutagens that can cause mutations. However, the rate of mutation is very low because of the four main types of DNA repair.

DNA Injury Detection and Signaling


The human genome is under constant toxic stress from normal cellular conditions such as free radicals or errors in DNA replication, as well as extrinsic conditions such as UV radiation. To combat these stresses and properly maintain the genome, the DDR pathway, or DNA damage response pathway has evolved. This pathway serves to detect errors or abnormalities, propagate the detection signal, and activate systems to correct the issue. If the damage is irreparable, the cell undergoes apoptosis, or programmed cell death, to avoid passing on the potentially lethal errors in DNA. Cells come across DNA damage constantly, so the DDR pathway is vital to cell survival.

The most lethal form of DNA damage comes from ionizing radiation, which causes breaks in the double strand. The repair protein RAD51 quickly collects into foci at sites of DNA damage. Damage-induced phosphorylation of the histone variant H2AX is thought to mark the sites of DNA breaks; many other repair proteins also collect at these sites of H2AX accumulation. Mice lacking H2AX show degradation of the immune system and an increased incidence of tumors.

The major regulators of the cellular response to DNA damage are the kinases ATM (ataxia telangiectasia mutated) and ATR (ataxia telangiectasia and Rad3-related), which regulate the phosphorylation of over 700 proteins. This phosphorylation is the initial step in the signaling of DNA damage.

“Structural Dynamics in DNA damage signaling and repair” is an article by JJ Perry, Elizabeth Cotner-Gohara, Tom Ellenberger, and John A. Tainer. It examines DNA damage responses with a focus on the role of proteins in these pathways. DNA is continually damaged by metabolites and toxicants, so DNA repair and the damage response are essential to life. Repair proceeds in three steps: the damage is detected, removed, and then replaced with the correct DNA sequence. The pathway regenerates a 3’ terminus that is extended by DNA polymerase using the undamaged strand as the template. Repair is completed when a ligase reseals the DNA backbone. Because this process generates toxic intermediates, strong “genetic selection” for well-coordinated repair operates as the DNA is being restored. Protein structures have been found to underlie the coordination of steps within the DNA damage response and repair pathways. This is important because these proteins are, once again, tied to the DNA replication process.

When different methods are combined, the dynamics of DNA repair complexes can be studied in great detail. Such methods include X-ray crystallography, NMR, small-angle X-ray scattering (SAXS), and hydrogen-deuterium exchange mass spectrometry (DXMS). Together they provide information from the nanoscale down to the atomic level. For instance, SAXS gives information on the flexibility of macromolecules in solution, and on entire pathways and their interactions in solution. DXMS, in turn, reveals the conformational changes that take place during repair, down to the resolution of a single amino acid. Combining different structural biochemistry methods thus helps scientists discover how the DNA repair and damage response systems are coordinated. Current studies have found that the “Transition between different enzyme conformations can involve non-native interactions that lower the energy barrier for inter-conversion between different states” (1). This finding is important because it connects conformational changes in the DNA repair complex to the biological outcomes of those changes. For instance, as stated, changes in enzyme conformation lower the activation energy for conversion between different states during the restoration of damaged DNA. Another example is that altering the normal flexibility and stability of repair proteins can cause serious genetic diseases. Changes in DNA and ATP binding have been linked to cancer, and defects in the flexibility and stability of the DNA repair framework have been linked to aging disorders such as Cockayne syndrome and trichothiodystrophy (TTD).

Damage repair is carried out by the multi-domain nucleotide excision repair (NER) helicase. This machinery removes bulky, helix-distorting lesions from one strand of the DNA. The process is very precise: only the damaged strand is cut, and the undamaged strand is left intact because it serves as the template for repair. The NER proteins assemble in a way that allows the damaged site to be verified before the DNA backbone is actually cut. One example is yeast Rad4, a multi-domain protein that binds to the distorted part of the helix being repaired by NER. Binding of the protein has been shown to stabilize the distorted DNA structure. Observations show that Rad4 inserts a beta hairpin through the DNA helix, displacing its bases. One surprising discovery was that Rad4 binds to the undamaged strand rather than to the damaged one. The damaged strand offsets the helical axis, causing a bend that extends the Rad4-DNA interaction surface to the neighboring hairpin regions. This extended interaction stabilizes the damaged DNA, although its bases are now exposed to the solvent. The stabilization aids NER as it repairs the damaged strand.

Another important component of the DNA damage response is the base excision repair (BER) pathway. Unlike NER, BER can detect and remove single nucleotides carrying very small modifications, such as the addition of a single methyl group. It is therefore extremely efficient at fixing subtly damaged DNA strands. In BER, the oxidative damage-specific glycosylases OGG1 and MutM interact with 8-oxoG bases. An 8-oxoG base presents a hydrogen-bond donor, N7, and an acceptor, O8. These interact with OGG1 and allow selective cutting of the damaged DNA. The resulting complex is known as the pseudo-Michaelis complex. Overall, different mechanisms of DNA damage response and repair have been observed by combining methods such as NMR, X-ray crystallography, and SAXS.

Below is an image of a DNA repair process in which DNA ligase I repairs chromosomal damage.

DNA Repair

Role of 9-1-1 in DNA Repair


DNA repair consists of the detection of existing damage and the actual repair of that damage. 9-1-1 is a heterotrimeric protein (it consists of three subunits, at least one of which differs from the other two) that wraps around DNA to initiate the recruitment of specific checkpoint proteins and temporarily freeze the cell cycle. More specifically, it promotes phosphorylation by Sc-Mec1/Hs-ATR (ataxia telangiectasia and Rad3-related), where the Sc- and Hs- prefixes refer to Saccharomyces cerevisiae (a eukaryotic species) and Homo sapiens, respectively. The protein kinases Chk1 and Sc-Rad53/Hs-CHK2 are activated, resulting in inhibition of the cell cycle at the G1/S, intra-S, or G2/M phase. Induction of repair genes, stabilization of the replication fork, and decreased production of cyclins (proteins that drive the cell cycle forward) also result from this activation. 9-1-1 works with Sc-Cdc28 to selectively accumulate Sc-Ddc2. The presence of Sc-Ddc2/Hs-ATRIP, Sc-Mec1/Hs-ATR, and 9-1-1 together activates the checkpoint regardless of whether DNA damage has been detected.

Mismatch Repair


Mismatch repair corrects mistakes in nucleotide pairing that escape the proofreading activity of DNA polymerase during replication. Incorrectly paired nucleotides deform the secondary structure of DNA. The MSH2-MSH6 dimer binds to the mismatch on the strand. MLH1, part of an endonuclease complex, then binds to the MSH dimer and the strand is nicked. Exonucleases degrade the region in between, DNA polymerase delta inserts the correct nucleotides, and DNA ligase reconnects the strand. In this way the repair enzymes cut out the distorted portion of the new DNA strand and use the old strand as a template to fill in the gap. In E. coli, the mismatch repair enzymes recognize the old DNA strand by the presence of methyl groups on certain sequences. In eukaryotic cells, how the enzymes distinguish between the old and new DNA strands is less well understood.
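The logic of using the old strand as a template can be sketched as a short Python example. It is a toy model under stated assumptions: the two strands are made-up sequences written index-aligned (strand orientation is ignored), and the function names are hypothetical. It shows only the bookkeeping of "find the non-complementary position, then rewrite the new strand from the old one", not the enzymology.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(old_strand, new_strand):
    """Positions where the new strand does not pair with the old strand."""
    return [
        i
        for i, (old, new) in enumerate(zip(old_strand, new_strand))
        if COMPLEMENT[old] != new
    ]


def repair(old_strand, new_strand):
    """Rewrite mismatched positions of the new strand from the old template."""
    fixed = list(new_strand)
    for i in find_mismatches(old_strand, new_strand):
        fixed[i] = COMPLEMENT[old_strand[i]]
    return "".join(fixed)


old_strand = "ATGCCGTA"   # hypothetical parental (template) strand
new_strand = "TACGTCAT"   # hypothetical daughter strand with one error

print(find_mismatches(old_strand, new_strand))
print(repair(old_strand, new_strand))
```

The key design point is that the old strand is never modified: the repair is only correct if the cell can tell which strand is the original.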

Direct Repair

Pyrimidine Dimers

In direct repair, the damaged nucleotide is chemically restored to its correct structure rather than being replaced entirely. UV radiation from the sun causes pyrimidine dimers by forming covalent bonds between adjacent pyrimidines. Some eukaryotic cells have an enzyme called photolyase, which uses the energy of light to break the covalent bonds of the dimer.
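Because dimers form only between adjacent pyrimidines on the same strand, the positions at risk can be listed by scanning a sequence for neighboring C or T bases. The following Python sketch does this for a made-up sequence; the function name is hypothetical, and the scan says nothing about how likely a dimer is to form at each site.

```python
PYRIMIDINES = {"C", "T"}


def dimer_sites(strand):
    """Return (position, dinucleotide) for every pair of adjacent pyrimidines."""
    return [
        (i, strand[i:i + 2])
        for i in range(len(strand) - 1)
        if strand[i] in PYRIMIDINES and strand[i + 1] in PYRIMIDINES
    ]


sites = dimer_sites("AGTTCGAC")   # hypothetical single-stranded sequence
print(sites)
```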

Nucleotide Excision


NER Helicase


DNA repair is carried out by the nucleotide excision repair (NER) helicase, a protein that is composed of multiple domains. NER assembles around damaged DNA regions (which, because of their error, contain a bulge or lesion that encourages NER to bind) in a stepwise manner, allowing damage to be carefully verified before the actual excision is performed. For example, yeast Rad4 protein (an analogue of mammalian XPC) indirectly detects DNA damage by binding to a nearby undamaged region. The damaged DNA strand is flexible, allowing a stable complex to form which includes Rad23, the protein that actually repairs the damage.

If XPC-Rad4 cannot detect a damaged site, one alternative involves the DDB1-DDB2 dimer. This dimer forms a complex with the damaged DNA region and a ubiquitin ligase. The complex ubiquitinates XPC and DDB2; DDB2 then releases the DNA molecule, passing it on to XPC and the normal NER process.

Nucleotide Excision Repair can be divided into two subcategories: Global Genome Repair and Transcription Coupled Repair.

In Global Genome Repair, the XPC-hHR23B dimer binds to the damaged DNA, and transcription factor IIH (TFIIH) then binds to the complex. XPG then binds and the DNA is unwound further. The nucleases XPG and XPF cleave the DNA, which removes the damaged segment. DNA polymerase delta then fills in the gap with the correct nucleotides, and DNA ligase reconnects the strand.

In Transcription Coupled Repair, RNA polymerase stalls at the damaged site; Cockayne syndrome B protein (CSB) then displaces the RNA polymerase and recruits TFIIH and XPG. The DNA is unwound before the nucleases XPG and XPF cleave it. The damaged section is then removed, DNA polymerase delta fills in the gap, and ligase reconnects the strand.

Source: Molecular Cell Biology, Lodish et al., 6th edition (2008), pages 145-160

The Base-Excision Repair pathway

BER basic pathway

Not all damage is large enough to cause the lesions that are detected by NER. The base excision repair (BER) pathway repairs single-nucleotide errors, sometimes as slight as the addition of a methyl group. While small, such damage can often be enough to impede DNA replication or produce nonfunctional proteins. Damage detection in the BER pathway is difficult because, in addition to being small, the errors are numerous. Many different enzymes are used to detect different small errors and initiate the BER pathway.

The first step in base-excision repair is the excision of the modified base. Enzymes called DNA glycosylases, each of which recognizes a certain type of modified base, cleave the bond between the 1'-carbon of the deoxyribose sugar and the base, removing the base. An enzyme called apurinic or apyrimidinic (AP) endonuclease then breaks the phosphodiester bond, and another enzyme removes the deoxyribose sugar. DNA polymerase adds the correct nucleotide to the free 3'-OH group. Finally, DNA ligase reconnects the DNA strand by forming a phosphodiester bond.
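The sequence of steps above can be traced in a small Python sketch. It is a toy model, not a description of any real enzyme's mechanism: the strands are made-up and index-aligned, uracil ("U") in DNA is used as the example of a modified base (an assumption for illustration), an abasic site is marked "*", and a gap is marked None.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def glycosylase(strand, damaged_base="U"):
    """Step 1: remove the damaged base, leaving an abasic (AP) site '*'."""
    return ["*" if base == damaged_base else base for base in strand]


def ap_endonuclease(strand):
    """Step 2: cut the backbone at each AP site and remove the sugar (gap = None)."""
    return [None if base == "*" else base for base in strand]


def polymerase_and_ligase(strand, partner):
    """Steps 3-4: fill each gap from the partner strand, then reseal."""
    return "".join(
        COMPLEMENT[p] if base is None else base
        for base, p in zip(strand, partner)
    )


partner = "TACGGCAT"          # hypothetical undamaged strand
damaged = list("ATGUCGTA")    # hypothetical strand with one modified base

step1 = glycosylase(damaged)
step2 = ap_endonuclease(step1)
repaired = polymerase_and_ligase(step2, partner)
print("".join(step1), repaired)
```

Each function changes only one thing, mirroring the way each enzyme in the pathway hands a defined intermediate to the next.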

Backbone repair and DNA ligase


Damage to the sugar-phosphate backbone of DNA is repaired by DNA ligases. Because the DNA backbone is common to all organisms, these ligases are likewise found in every organism that uses DNA as its genetic material. DNA ligase seals breaks in the backbone in a three-step process. In the first step, several of the enzyme's domains adopt a specific conformation, allowing an active-site lysine residue to be adenylated. In the last two steps, the enzyme encircles the broken DNA strand and fuses the two ends together.


Double-Strand Break Repair


Breaks in the double strand of DNA are common and are particularly hazardous to the cell because they increase the chance of genetic mutation. Major causes of double-strand breaks include reactive oxygen species from oxidative metabolism, ionizing radiation, and enzyme errors. The break can be repaired in one of two major ways: homology-directed repair and the nonhomologous DNA end joining (NHEJ) pathway.

Homology-Directed Repair


Any diploid organism can use homology-directed repair, even if the diploidy is temporary, as in bacteria. Types of homology-directed repair include homologous recombination, single-strand annealing, and break-induced replication. Homologous recombination requires an identical or nearly identical DNA sequence as a template for repair. It therefore occurs only during and shortly after DNA replication (the S phase of the cell cycle) and before mitosis. Nucleotide sequences are then exchanged between the similar strands.

Nonhomologous DNA End Joining Pathway (NHEJ)


NHEJ arose as an alternative to homology-directed repair, as template donors are usually not available in nondividing cells. Its mechanism is remarkably flexible, and it can convert a wide diversity of substrates into the desired product. Like other DNA repair processes, it requires three main activities: a nuclease to resect damaged DNA, polymerases to fill in new DNA, and a ligase to restore the strand. Key components include Ku, DNA-PKcs, Artemis, the Pol X family polymerases, and the ligase complex consisting of XLF, XRCC4, and DNA ligase IV. Each DNA end can be modified independently multiple times, and the flexible nature of the pathway permits substitution with other enzymes. Solutions to the problem of joining heterogeneous DNA ends at double-strand breaks have been shown to have evolved convergently in prokaryotes and eukaryotes.


  1. Huen, M. SY. "Assembly of checkpoint and repair machineries at DNA damage sites." Trends in Biochemical Sciences, Volume 35, Issue 2, 101-108, 28 October 2009
  2. Perry JJ, Cotner-Gohara E, Ellenberger T, Tainer JA. “Structural dynamics in DNA damage signaling and repair.” Curr. Opin. Struct. Biol. 2010 Jun; 20(3)
  3. Pierce, Benjamin A., Jung H. Choi, and Mark E. McCallum. Genetics: a Conceptual Approach. New York, NY: W.H. Freeman, 2008. Print.
  4. Lieber MR. The mechanism of double-strand DNA break repair by the nonhomologous DNA end-joining pathway. Annu Rev Biochem. 2010;79:181–211.
  5. Perry, J. Jefferson P., Elizabeth Cotner-Gohara, Tom Ellenberger, and John A Tainer. “Structural Dynamics in DNA Damage Signaling and Repair”. Current Opinion in Structural Biology. (2010): 283-294. ScienceDirect.
  6. Eichinger, S. Christian and Stefan Jentsch. "9-1-1: PCNA's specialized cousin." Trends in Biochemical Sciences, Volume 36, Issue 11, 563-568, 04 October 2011.



Mismatch repair in mammals is an important mechanism in the overall process of DNA repair. Mismatch repair (MMR) works by removing incorrectly matched base pairs in double-stranded DNA and replacing them with the correct base pairs. However, MMR has other known functions, including mutagenesis under certain in vivo conditions.

Canonical MMR Mechanism


Errors in DNA replication pose many problems, both for the integrity of the DNA and for the individual. MMR is one way these errors are fixed; deficiency in MMR is known to cause cancerous tumors in animal models.

The basic MMR system relies on the proteins MutSα, MutLα, EXO1, RFC, PCNA, RPA, polymerase-δ, and DNA ligase I. MMR has three basic steps, known as licensing, degradation, and resynthesis.



In licensing, MutSα binds to the mismatch in the DNA strand, which changes the conformation of MutSα into a sliding clamp. This change depends on an exchange of ADP for ATP. MutLα is recruited to form a ternary complex with MutSα, which then diffuses along the DNA strand until it reaches PCNA.

PCNA, or proliferating cell nuclear antigen, is a protein that can undergo a conformational change to become a ring around DNA. To attach to the DNA, it relies on RFC, or replication factor C, which uses ATP hydrolysis to load PCNA onto the DNA. This loading is efficient only when there is a "nick" in the DNA or an apyrimidinic (AP) site. PCNA can then attach to the 3' end of the nick. While RFC can load PCNA without a nick in the DNA, this is done with extremely low efficiency.

Once the complex reaches PCNA, the cryptic endonuclease of MutLα is activated and makes additional nicks on both sides of the mismatch, on the same strand. This is necessary only when PCNA is bound on the 3' side of the mismatch. Nicks are made only on the strand containing the mismatch because PCNA is not symmetric and has two distinct faces. MutLα can interact with PCNA only on a specific face, so the complex has a fixed orientation, which remains constant even as it slides along the DNA. The endonuclease activity of MutLα resides in one of its heterodimer subunits, PMS2, which nicks only the strand to which PCNA is bound.

Nicks are made close to the mismatch, which is essential for DNA repair, because the MutSα/MutLα complex that makes them is most concentrated around the mismatch site, where it therefore collides with PCNA most frequently. This is especially important in replication, where PCNA molecules remain on the DNA for an extended period even after replication. RFC loads them at the 3' terminus of an Okazaki fragment on the lagging strand. They are loaded with a fixed orientation, which allows MutSα/MutLα to cleave the nascent DNA strand even after the gaps between Okazaki fragments have long been sealed. This nick generation gives the MMR system the correct directionality.

DNA Repair by DNA ligase I



In degradation, EXO1 is loaded at the nicks created by the PCNA-activated MutSα/MutLα complex. This creates a large gap that starts at the nick and ends around 150 bases after the mismatch. The gap is single-stranded and lies on the same strand as the mismatch. EXO1 is an exonuclease that can only cut in the 5' to 3' direction.
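The extent of the excision can be worked out with simple arithmetic: the gap runs from the nick to roughly 150 bases beyond the mismatch. The Python sketch below computes that interval for hypothetical coordinates measured along the nicked strand in the 5' to 3' direction. The function names are made up, and the fixed 150-base overshoot is taken from the approximate figure in the text above, not from a precise rule.

```python
def exo1_gap(nick_pos, mismatch_pos, overshoot=150, strand_length=None):
    """Return the (start, end) interval removed by 5'->3' excision.

    Coordinates increase in the 5'->3' direction, so the nick must lie
    at or before the mismatch for EXO1 to reach it.
    """
    if nick_pos > mismatch_pos:
        raise ValueError("a 5'->3' exonuclease needs a nick 5' of the mismatch")
    end = mismatch_pos + overshoot
    if strand_length is not None:
        end = min(end, strand_length)
    return nick_pos, end


def excise(strand, nick_pos, mismatch_pos, overshoot=150):
    """Replace the excised bases with '-' to show the single-stranded gap."""
    start, end = exo1_gap(nick_pos, mismatch_pos, overshoot, len(strand))
    return strand[:start] + "-" * (end - start) + strand[end:]


print(exo1_gap(100, 400))                          # gap on a long strand
print(excise("ACGTACGTAC", 2, 4, overshoot=3))     # tiny demo, overshoot shrunk
```

The check on the nick position reflects the 5' to 3' restriction on EXO1: a nick on the 3' side alone would not let this exonuclease reach the mismatch.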



Resynthesis involves PCNA, polymerase-δ, and DNA ligase I in order to replace the removed bases and, overall, fix the mismatch error.

EXO1 Independent Mechanism


Although this has not been proven in humans, EXO1-deficient mice showed fewer mutations than MSH2- and MLH1-deficient mice, indicating a mismatch repair mechanism that does not require EXO1. Indeed, MMR from a 5' nick could occur without EXO1 through the use of polymerase-δ, MutSα, RPA, RFC, and PCNA. When there is a nick 5' of the mismatch, polymerase-δ can catalyze strand displacement, whereupon FEN1 can catalyze removal of the displaced strand containing the mismatch. DNA ligase I would then seal the remaining nick.

Insertion/Deletion Loops and Trinucleotide Repeats


Insertion/Deletion loops (IDLs) and trinucleotide repeats (TNRs) interact largely with MMR in both error-preventing and error-propagating ways.

Origin of IDLs


IDLs arise from the activity of polymerase on TNRs. Trinucleotide repeats are long runs of a single repeated triplet of nucleotides. Such repeats have been implicated in diseases such as Fragile X. When polymerase reads these repeats, it slows down. Helicase, however, does not slow down, and because it moves relatively faster, long stretches of single-stranded DNA are exposed. These stretches can bunch up and form an IDL, which causes polymerase to create shorter-than-usual DNA strands.
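Runs of a repeated triplet are easy to locate computationally, which gives a concrete sense of what a trinucleotide repeat looks like in sequence data. The following Python sketch uses a regular expression with a back-reference. The example sequence is made up (it uses CGG, the triplet repeated in Fragile X), the function name is hypothetical, and the threshold of five repeats is an arbitrary choice for illustration.

```python
import re


def find_trinucleotide_repeats(seq, min_repeats=5):
    """Return (start, unit, count) for each run of a repeated triplet.

    The pattern captures one triplet and requires it to be followed by
    at least (min_repeats - 1) further copies of itself. Note that a
    homopolymer run such as AAAAAA... also matches, with unit 'AAA'.
    """
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // 3)
        for m in pattern.finditer(seq)
    ]


sequence = "TTA" + "CGG" * 6 + "ATT"   # hypothetical fragment with 6 repeats
print(find_trinucleotide_repeats(sequence))
```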

The Error-Preventing Role of MMR


When loops like this form, MMR can work to fix them. If the loop is no more than two to three extrahelical nucleotides long, canonical MMR can fix it. If the loop is longer, there is a MutSβ-mediated way for it to be fixed. This must happen by some other MMR mechanism, because in the regular process PCNA would not be able to diffuse past the large loop. Thus, there must be some form of MMR that is not mediated by EXO1.

The Error-Propagating Role of MMR


In certain cases, MMR may be "hijacked" to cause TNR expansion. Consider a cruciform loop structure, in which there are loops in both strands at the same relative position. If PCNA is loaded by RFC onto one of the loops, it can activate the endonucleolytic activity of MutSβ and MutLα, and the resulting cleavage may cause one of the loops to collapse. When polymerase replaces the missing nucleotides, the trinucleotide repeat is extended. In diseases caused by TNRs (such as Huntington's disease), a larger number of repeats has been linked to more severe disease, so this is an important field of study.

Antibody Variation and Class-Switching


Antibody variation, although it has a variety of causes, depends largely on MMR. After VDJ recombination, a process that recombines the variable, diversity, and joining regions of the immunoglobulin genes, a variety of IgM antibodies can be made. However, there are further mechanisms for antibody variability.

The role of MMR in Somatic Hypermutation


Somatic hypermutation (SHM) is a process whereby many mutations arise in the variable region of the antibody. It works through activation-induced cytidine deaminase (AID) where C nucleotides are converted to U. This occurs during transcription, because AID works best on single-stranded DNA. When this occurs, there is a mismatch error on the resultant DNA. As such, uracil DNA-glycosylase works to do base-excision repair (BER) and remove the incorrect U nucleotide. However, once this happens there is an apyrimidinic site (AP) remaining. This site can be the target of EXO1 in order for MMR to occur.

In this case, EXO1 cleavage may cause a large swath of DNA to be excised. When polymerase goes to fix it, AP sites and remaining uracil nucleotides may cause incorrect mutations at the sites where AID acts. This would result in changes in the variable site of the antibody and ultimately, different antigen recognition.
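The first step described above, deamination of C to U and the resulting U:G mismatch, can be sketched on plain strings. This is a toy model under stated assumptions: the helper names `deaminate` and `mismatches` and the seven-base sequence are hypothetical, and the bottom strand is written position-by-position as the complement of the top strand (not reversed) to keep the indexing simple.

```python
def deaminate(strand, positions):
    """Convert C to U at the given 0-based positions of a DNA strand."""
    bases = list(strand.upper())
    for i in positions:
        if bases[i] != "C":
            raise ValueError(f"position {i} is {bases[i]}, not C")
        bases[i] = "U"
    return "".join(bases)


def mismatches(top, bottom):
    """Return positions where the top base does not pair with the bottom base."""
    # Watson-Crick pairs; U pairs with A, so U opposite G counts as a mismatch.
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("U", "A")}
    return [i for i, (t, b) in enumerate(zip(top, bottom)) if (t, b) not in pairs]


# Hypothetical duplex, bottom strand aligned base-by-base under the top strand.
top = "AGCTACG"
bottom = "TCGATGC"

edited = deaminate(top, [2])       # AID acts on the C at index 2
print(edited)                      # the C at index 2 is now U
print(mismatches(edited, bottom))  # positions of U:G mismatches
```

The sketch stops at the mismatch. The downstream steps (uracil removal, AP site formation, EXO1 excision and error-prone resynthesis) are not modeled.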

The role of MMR in Class-Switch Recombination


MMR can also cause the class of antibody to change, such as from IgM to IgG, while the antibody still recognizes the same antigen. When there are two AP sites and EXO1 excises the section of DNA from one gap to the other, a double-strand break results and class-switch recombination (CSR) can occur.


  1. Peña-Diaz, J., & Jiricny, J. (2012). Mammalian mismatch repair: error-free or error-prone? Trends in biochemical sciences, 37(5), 206–14. doi:10.1016/j.tibs.2012.03.001
  2. Zhao, J. et al. (2009) Mismatch repair and nucleotide excision repair proteins cooperate in the recognition of DNA interstrand crosslinks. Nucleic Acids Res. 37, 4420-4429
  3. Lopez Castel, A. et al (2010) Repeat instability as the basis for human diseases and as a potential target for therapy. Nat. Rev. Mol. Cell Biol. 11, 165-170.



DNA strand breaks are often caused by internal and external factors. The termini of these breaks require processing before the missing nucleotides can be replaced by DNA polymerase and the strands rejoined by DNA ligases. The enzyme polynucleotide kinase/phosphatase (PNKP) plays an important role in repairing DNA strand breaks by catalyzing the restoration of the DNA’s termini. In addition, PNKP participates in other DNA repair pathways through interactions with other DNA repair proteins such as XRCC1 and XRCC4. PNKP is important in maintaining the genomic stability of normal tissues, such as developing neural cells, and in enhancing the resistance of cancer cells to genotoxic therapeutic agents.

Polynucleotide kinase/phosphatase


Damage to cellular DNA is implicated in aging, in cancer etiology and treatment, and in neurological disorders. DNA damage comes in different forms, such as base modification, base loss, and strand breaks. It can be caused by intracellular agents, primarily reactive oxygen species (ROS), and by exogenous agents. To protect themselves, cells have evolved a battery of repair pathways that counter the mutational and cytotoxic consequences of DNA damage. Strand breaks arise through several mechanisms, including cleavage by physical and chemical means such as ionizing radiation (IR) and ROS, and enzymatic processes. Strand breaks therefore come in a wide variety of forms and can be further classified by the nature of their termini. The enzyme PNKP carries 5’-kinase and 3’-phosphatase activities that are essential for processing the termini of single- and double-strand breaks. Research into PNKP has shown that small molecule inhibitors of this enzyme sensitize cells to IR or chemotherapeutic agents. Researchers have also found that mutations in PNKP, like mutations in genes encoding other strand break repair proteins, are connected to a severe autosomal recessive neurological disorder.

Chemistry of strand break termini


IR-and free radical-induced breaks


Ionizing radiation (IR) causes strand breaks with a variety of end groups at 3’-termini by generating hydroxyl radicals. These radicals react at different carbon atoms within the deoxyribose group to produce two predominant end groups: phosphate and phosphoglycolate. Phosphoglycolate formation depends on the presence of oxygen, while 3’-phosphate groups are produced under both normoxia and anoxia. At the 5’-termini, the major end group is phosphate. In addition to singly damaged sites, ionizing radiation also generates complex lesions, which contain two or more damaged bases or strand breaks in close proximity. Complex lesions include frank DSBs, and the ratio of SSB:DSB has been determined to be ~25:1. Hydrogen peroxide also causes strand breaks. Its damage is similar to IR-mediated damage, but it causes far fewer frank DSBs. Bleomycin, a chemotherapeutic agent, also produces DSBs, with 3’-phosphoglycolate termini.

Camptothecin-induced breaks


The enzyme topoisomerase 1 relieves torsional strain by creating a DNA cut with a 5’-OH terminus and a covalent 3’-phosphate-enzyme intermediate. Camptothecin acts on this intermediate and prevents the strand from being rejoined, leaving a DNA-enzyme ‘dead-end’ complex. Hydrolysis of this complex by tyrosyl-DNA phosphodiesterase produces strand breaks with 3’-phosphate and 5’-OH termini.

Repair-endonuclease induced breaks


Damaged bases can be removed by DNA glycosylases. The resulting abasic sites are then cleaved by one of two classes of enzymes. The first, AP endonuclease, hydrolyses the phosphodiester bond 5’ to the abasic site to give 3’-OH and 5’-deoxyribose phosphate termini. DNA polymerase β can then convert the 5’-deoxyribose phosphate terminus to 5’-phosphate. The second, AP lyase, cleaves the phosphodiester bond 3’ to the abasic site by a β-elimination reaction to give an α,β-unsaturated aldehyde (a pentenal moiety) attached to the 3’-phosphate at one terminus and a 5’-phosphate at the other. Many DNA glycosylases have this lyase activity. The pentenal moiety can then be removed by an AP endonuclease to give 3’-OH, or by an AP lyase to give 3’-phosphate. NEIL1 and NEIL2, mammalian DNA glycosylases with β,δ-lyase activity, remove a wide range of mutagenic and cytotoxic oxidative pyrimidine lesions and purine-derived formamidopyrimidines.

Molecular architecture of PNKP


PNKP is a multidomain enzyme. It consists of two domains: an N-terminal forkhead-associated (FHA) domain and a C-terminal catalytic domain composed of fused phosphatase and kinase subdomains. The FHA and catalytic domains are linked by a flexible polypeptide segment. The FHA domain selectively binds acidic casein kinase 2 (CK2)-phosphorylated regions in XRCC1 and XRCC4, which are important scaffolding proteins in the repair of DNA SSBs and DSBs. Aprataxin and APLF are DNA repair factors that also contain FHA domains, and these likewise bind CK2-phosphorylated XRCC1 and XRCC4. This shared specificity suggests that the binding of these proteins to the phosphorylated scaffolding factors could be regulated in a coordinated way. PNKP and T4 polynucleotide kinase are similar in that their catalytic domains both contain contiguous kinase and phosphatase subdomains. They differ in that the T4 enzyme lacks an FHA domain and has its kinase subdomain N-terminal to the phosphatase subdomain.

In murine PNKP, the two catalytic active sites are positioned on the same side of the protein. The murine and T4 kinase subdomains share a similar structure, with a bipartite active site cleft that has separate ATP and DNA binding sites. The ATP binding site includes the Walker A (P-loop) and Walker B motifs conserved in various kinases, as well as an aspartic acid that activates the 5’-hydroxyl for attack on the ATP γ-phosphate. The DNA binding sites of the mammalian and phage enzymes differ. The DNA binding cleft of phage PNK forms a narrow channel that leads to the conserved catalytic aspartic acid residue and accommodates single-stranded substrates. The mammalian enzyme instead preferentially phosphorylates 5’-hydroxyl termini within nicks, gaps, or DSBs with single-stranded 3’ overhanging ends, and phosphorylates single-stranded 5’ termini less efficiently. Its broad DNA recognition groove, composed of two distinct positively charged surfaces, selectively recognizes larger, double-stranded DNA substrates. Using structural information from small angle X-ray scattering experiments, coupled with the effect of amino acid substitutions on the surfaces of the kinase, researchers found that DNA substrates bind across these surfaces in a defined orientation.

The phosphatase subdomain has a haloacid dehalogenase fold, which is typical of many phosphatases. The mechanism employed by these enzymes depends on Mg2+ and proceeds through a catalytic aspartate and an acyl-phosphate intermediate. Mammalian PNKP acts on a multitude of 3’-phosphate ends, such as those within nicks, gaps, DSBs, and single-stranded termini. Two narrow channels surrounded by large positively charged loops provide a pathway to the phosphatase active site, but they are not wide enough to take in double-stranded substrates. This suggests that either a remodeling of the phosphatase substrate binding surface or an unwinding of the DNA is needed to accommodate double-stranded substrates.

The DNA repair scaffold proteins XRCC1 and XRCC4 interact with PNKP through binding of the PNKP FHA domain to phosphorylated motifs on XRCC1 and XRCC4. FHA domains are phospho-peptide binding modules with a β-sandwich fold. A series of loops jut out from one side of the β-sandwich and provide a peptide binding surface with a marked preference for targets that contain a phospho-threonine residue. Even though XRCC1 and XRCC4 are structurally unrelated, they share similar motifs that are phosphorylated by CK2 and act as binding sites for the PNKP FHA domain. A cluster of CK2 phosphorylation sites between residues 515 and 526 in XRCC1 is needed for interaction with PNKP, and amino acid substitutions within this region significantly reduce the efficiency of SSB repair. Similarly, a primary CK2 site in XRCC4, Thr233, is needed for PNKP binding and for efficient repair of DSBs in vivo. The sequences around these sites are significantly conserved. A conserved serine is also phosphorylated, and the structure of the complex with the primary phospho-threonine reveals a dynamic interaction of this residue with Arg35 or Arg44 of the PNKP FHA domain. A tyrosine residue is conserved at the -4 position and an asparagine residue at the +3 position. Some interactions are not conserved in the complex of the FHA domain with XRCC4, in which the +3 residue is a glutamic acid. Because the peptides are acidic, long-range electrostatic interactions with the largely positively charged peptide-binding surface contribute to binding specificity. Threonine phosphorylation at the +4 position also contributes to binding selectivity through the recruitment of a second PNKP FHA domain.

PNKP and single-strand break repair (SSBR)


SSBR is a multienzyme pathway that uses different participants depending on the causative agent. IR-induced strand breaks, for example, involve the loss of at least one nucleotide. Damage recognition and restoration of the broken strand termini are carried out by poly(ADP-ribose) polymerase (PARP), XRCC1, AP endonuclease 1 (APE1) and PNKP, with other proteins acting as backups. Replacement of nucleotides and strand resealing then follow, either by a short patch pathway that involves DNA polymerase β and DNA ligase III, or by a long patch pathway that uses DNA polymerase δ and/or ɛ, the FEN1 endonuclease and DNA ligase I. After IR, APE1 removes 3′-phosphoglycolates while PNKP hydrolyses 3′-phosphate groups, because the 3′-phosphatase activity of APE1 is much weaker than that of PNKP. PNKP also ensures that 5′-OH termini are phosphorylated. Because the phosphatase activity of PNKP is much greater than its kinase activity, the phosphatase reaction takes priority at strand breaks that have both 3′-phosphate and 5′-OH termini. The phosphatase activity of PNKP was shown to be important for the rapid repair of hydrogen peroxide-induced SSBs in mammalian cells by the failure of overexpressed phosphatase-defective PNKP to compensate for Xrcc1 deficiency. Correspondingly, a small molecule inhibitor of PNKP phosphatase activity dramatically retards SSBR in irradiated human cells. While these results show the importance of the phosphatase activity of PNKP, the physiological importance of its 5′-kinase activity has yet to be determined.

In a commonly accepted model for the repair of radiation-induced SSBs, PARP catalyzes the polymerization of chains of ADP-ribose onto acceptor chromatin proteins and onto itself. This attracts the scaffold protein XRCC1, and possibly also its tightly bound partner DNA ligase III. XRCC1 in turn recruits PNKP or APE1 to restore the terminal groups that DNA polymerase β needs in order to add the missing base, allowing DNA ligase III to rejoin the strand. Analysis of protein-protein interactions found direct interactions between XRCC1 and PNKP, as well as with DNA polymerase β and DNA ligase III. These interactions suggest that the four proteins can form a tetrameric complex, which would be consistent with various models. While there is evidence for interactions between XRCC1 and PNKP, there is also evidence against the idea that XRCC1 recruits either PNKP or APE1 to the strand break. Experiments that cross-linked proteins to DNA substrates were used to track the temporal association of SSBR proteins in HeLa cells. For substrates with either 3′-phosphoglycolate or 3′-phosphate termini, APE1 and PNKP were recruited to the strand breaks before XRCC1/DNA ligase III. In addition, immunodepletion of APE1 or PNKP diminished the binding of XRCC1 to the corresponding substrates. This indicated that APE1 and PNKP recruit XRCC1 to sites of oxidative damage rather than the reverse. Conversely, PNKP foci were found in the nuclei of hydrogen peroxide-treated cells expressing XRCC1 but not in cells lacking XRCC1. This suggests that although XRCC1 might not be required for the initial recruitment of PNKP or APE1, it promotes the focal accumulation and stimulation of these enzymes at sites of chromosomal damage.

Even though the DNA repair protein XRCC1 lacks inherent enzymatic activity, it enhances both the kinase and phosphatase activities of PNKP. The mechanism of this stimulation was worked out by using fluorescence measurements to study the binding of PNKP to substrates that mimic different strand breaks. PNKP bound tightly to a nicked substrate with a 5′-OH terminus, with a Kd value of 0.25 μM, but this was only 5- to 6-fold tighter than its binding to the identical duplex bearing a 5′-phosphate. PNKP therefore tends to stay bound to the product of its kinase activity. The presence of XRCC1 did not influence the binding of PNKP to the nonphosphorylated substrate, but it abolished the interaction of PNKP with the phosphorylated duplex, indicating that XRCC1 displaces PNKP from the reaction product. The kinetics of product accumulation under limiting enzyme concentration confirmed that the addition of XRCC1 increases PNKP enzymatic turnover. Similar kinetic data were observed for the phosphatase activity of PNKP.
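To give a feel for what a 5- to 6-fold difference in Kd means, the sketch below computes the fraction of PNKP bound to each substrate at equilibrium. It assumes the simplest possible model, 1:1 binding with DNA in excess so that free DNA is approximately total DNA. That model, the function name `fraction_bound`, and the DNA concentrations tried are illustrative assumptions; only the 0.25 μM Kd and the 5- to 6-fold range come from the text.

```python
def fraction_bound(dna_um, kd_um):
    """Fraction of enzyme bound for simple 1:1 binding: [DNA] / ([DNA] + Kd)."""
    if dna_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return dna_um / (dna_um + kd_um)


KD_5OH_UM = 0.25                                 # substrate (5'-OH), from the text
KD_5P_RANGE_UM = (5 * KD_5OH_UM, 6 * KD_5OH_UM)  # product (5'-phosphate), 5- to 6-fold weaker

# Hypothetical DNA concentrations in micromolar, chosen only for illustration.
for dna in (0.25, 1.0, 5.0):
    substrate = fraction_bound(dna, KD_5OH_UM)
    product_low = fraction_bound(dna, KD_5P_RANGE_UM[1])
    product_high = fraction_bound(dna, KD_5P_RANGE_UM[0])
    print(f"{dna:5.2f} uM DNA: 5'-OH {substrate:.2f}, "
          f"5'-phosphate {product_low:.2f}-{product_high:.2f}")
```

Under these assumptions, a few micromolar of phosphorylated product is already enough to hold most of the enzyme, which is consistent with the idea that product release limits turnover unless XRCC1 displaces PNKP.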

The relationship between PNKP and XRCC1 is further complicated by CK2-mediated phosphorylation of XRCC1. As well as promoting interaction with other proteins, XRCC1 phosphorylation stabilizes the XRCC1-DNA ligase III complex. Multiple sites of CK2-mediated XRCC1 phosphorylation, clustered within specific locations, have been observed in vitro. XRCC1 phosphorylation is needed to recruit XRCC1 and PNKP to nuclear foci in hydrogen peroxide-treated or γ-irradiated cells, and to promote more rapid repair of SSBs. A lack of XRCC1 phosphorylation did not affect cell survival. Further analysis nevertheless showed that a triple mutant XRCC1 expressed in cells lacking functional XRCC1 failed to fully restore rapid SSBR, indicating an important interaction with PNKP. The repair defect could be overcome by overexpression of PNKP. This shows that XRCC1 plays an important role in increasing PNKP enzyme turnover, especially when the cell contains a limiting concentration of PNKP.

Relative to nonphosphorylated XRCC1, CK2-phosphorylated XRCC1 stimulates the kinase and phosphatase activities of PNKP measured in vitro. Stimulation by nonphosphorylated XRCC1, on the other hand, is due to enhanced enzymatic turnover of PNKP. This raises a puzzle: phosphorylated and nonphosphorylated XRCC1 bind PNKP at different sites and with different affinities, yet both are able to stimulate PNKP by a similar mechanism. Phosphorylated XRCC1 binds the FHA domain with a Kd value of 4 nM, while the nonphosphorylated protein binds the catalytic domain of PNKP with a 10-fold weaker affinity. This points to the possibility of a phosphorylation-independent interaction between PNKP and XRCC1 in human cells. Consistent with this, PNKP co-immunoprecipitated with an XRCC1 triple mutant expressed in human 293T cells. Although 85–90% of the cellular XRCC1 is phosphorylated, this does not mean that the key cluster of amino acids involved in interaction with the FHA domain is fully phosphorylated. Treatment of cells with hydrogen peroxide caused an increase in phosphorylation at the cluster and an approximately 3-fold increase in PNKP copurifying with XRCC1. This suggests that cells may enhance CK2-mediated phosphorylation of XRCC1, and its subsequent interaction with the PNKP FHA domain, in direct response to a challenge by hydrogen peroxide or radiation, in order to deal efficiently with relatively high levels of DNA damage. Unstressed cells, by contrast, may cope with a comparatively low level of endogenous DNA damage in a different way: nonphosphorylated XRCC1, or XRCC1 with a restricted degree of phosphorylation, activates PNKP by binding to its catalytic domain.

Both PNKP depletion in cells and Pnk1 deletion in fission yeast cause sensitivity to camptothecin. XRCC1 oversees the repair of camptothecin-induced strand breaks by forming a complex with TDP1, DNA ligase III and PNKP. The neurodegenerative disorder spinocerebellar ataxia with axonal neuropathy-1 (SCAN1) is caused by mutation of TDP1. SCAN1 cells have a reduced capacity to repair camptothecin-induced SSBs and also display slow repair of hydrogen peroxide-induced SSBs. This suggests that TDP1 is required to repair lesions generated by oxidative processes, lesions that could account for the neurodegeneration observed in SCAN1. Supporting evidence comes from experiments in fission yeast in G0, in which Tdp1 and Pnk1 act sequentially to process the 3′-termini of naturally occurring SSBs.

PNKP and base excision repair (BER)


Base excision repair (BER) is the cellular mechanism responsible for the repair of most minor base modifications caused by IR, ROS and alkylating agents. The first step is the removal of the modified base by a DNA glycosylase, after which the DNA is cleaved at the newly formed apurinic/apyrimidinic (AP) site by APE1. Alternatively, a glycosylase that has AP lyase activity can cleave the AP site itself. The discovery of the mammalian DNA glycosylases nei endonuclease VIII-like-1 (NEIL1) and NEIL2 made it clear that PNKP is involved in the BER pathway, because these glycosylases possess β,δ-AP lyase activity that generates 3′-phosphate termini. Rather than binding directly to PNKP, they associate with larger complexes that contain other BER components, including PNKP. These glycosylases act on a variety of base lesions, including thymine glycol, 5-hydroxyuracil and 8-oxoguanine. They can also cleave intact abasic sites generated by glycosylases that do not possess AP lyase activity, as well as the pentenal moiety generated by the β-elimination AP lyase activity of other DNA glycosylases. In this respect the NEIL glycosylases would compete with APE1, forming the basis of a different, APE1-independent BER pathway. Although it is not yet known to what extent NEIL1- or NEIL2-catalyzed cleavage of abasic sites occurs in cells, it could explain the increased sensitivity of PNKP-depleted cells to the alkylating agent methyl methanesulfonate (MMS). This sensitivity came as a surprise because the major lesions inflicted by MMS are N7-methylguanine and N3-methyladenine, with little if any direct strand scission. The human DNA glycosylase responsible for removing these methylated bases, MPG, does not possess AP lyase activity, so the task of acting on the abasic sites generated by MPG to produce strand breaks with 3′-phosphate termini must fall to NEIL1 or NEIL2. Downregulating aprataxin expression also causes cells to be sensitive to MMS.

PNKP and double-strand break repair (DSBR)


Of the two major double-strand break repair pathways, there is evidence that PNKP participates in nonhomologous end joining. In contrast, PNKP deletion failed to influence IR-induced sister chromatid exchange, which suggests that PNKP may not be involved in homologous recombination. PNKP also plays a role in a back-up, XRCC1-dependent DSB repair pathway. Evidence for PNKP participation in nonhomologous end joining came from experiments with human cell-free extracts, which showed that PNKP kinase activity was required before linearized plasmid substrates bearing 5′-OH termini could be joined. XRCC4 and DNA-PK were important for efficient phosphorylation. Just as XRCC1 links PNKP to DNA ligase III, XRCC4 links PNKP to DNA ligase IV. CK2-mediated phosphorylation of XRCC4 at Thr233 allows it to interact with the PNKP FHA domain and stimulates XRCC4–DNA ligase IV-mediated ligation of a 5′-dephosphorylated plasmid substrate in vitro. In an Xrcc4-deficient cell line, expression of a mutant XRCC4 instead of wild-type XRCC4 reduces survival following irradiation by approximately 30% and slows the rate of DSB repair.

The functional role of the XRCC4-PNKP interaction was determined by combining biophysical and biochemical examination. While phosphorylation of XRCC4 promotes a tight affinity for PNKP, nonphosphorylated XRCC4 is also able to bind PNKP. In this case, however, binding is to the catalytic domain of PNKP and the affinity is weaker. Just as XRCC1 stimulates PNKP turnover from SSBs, nonphosphorylated XRCC4 stimulates PNKP enzymatic turnover from DSBs. Phosphorylated XRCC4 on its own failed to stimulate PNKP and instead blocked PNKP-mediated DNA phosphorylation. In the presence of DNA ligase IV, however, the complex that it forms with phosphorylated XRCC4 reverses the inhibition and stimulates PNKP turnover. The ratio of XRCC4:DNA ligase IV:PNKP in HeLa cells was found to be ∼7:1:3, with almost half of the XRCC4 phosphorylated at Thr233. This shows that in cells only a fraction of XRCC4 can be complexed to DNA ligase IV, and it leaves open the possibility of an FHA-independent interaction between XRCC4 and PNKP. This FHA-independent interaction was confirmed by co-immunoprecipitation of XRCC4 with PNKP expressed in cells depleted of endogenous PNKP. PNKP also has an important function in processing DSB 3′-phosphoglycolate termini, especially 3′-overhanging and blunt-ended termini. These termini are produced by IR, bleomycin and enediyne compounds such as neocarzinostatin. Although APE1 can remove phosphoglycolate groups at SSB termini and recessed DSB termini, it is much less effective at blunt-ended DSB termini and completely ineffective at overhanging termini.
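The stoichiometric argument above (a ∼7:1:3 ratio means only a fraction of XRCC4 can be bound to DNA ligase IV) can be made explicit with a small calculation. The sketch assumes that each DNA ligase IV molecule occupies two XRCC4 molecules, that is, one XRCC4 dimer. That 2:1 figure is an assumption introduced here, exposed as the parameter `xrcc4_per_lig4`, and the function name `xrcc4_partition` is hypothetical.

```python
from fractions import Fraction


def xrcc4_partition(xrcc4, lig4, xrcc4_per_lig4=2):
    """Return (fraction of XRCC4 bound to ligase IV, fraction left free).

    xrcc4 and lig4 are relative molar amounts. xrcc4_per_lig4 is the assumed
    number of XRCC4 molecules occupied by one ligase IV (default 2, a dimer).
    """
    bound = min(xrcc4, lig4 * xrcc4_per_lig4)  # cannot bind more XRCC4 than exists
    return Fraction(bound, xrcc4), Fraction(xrcc4 - bound, xrcc4)


# Relative amounts from the text: XRCC4 : DNA ligase IV : PNKP of about 7 : 1 : 3.
bound, free = xrcc4_partition(7, 1)
print(f"XRCC4 in complex with ligase IV: {bound} ({float(bound):.0%})")
print(f"XRCC4 not bound to ligase IV:    {free} ({float(free):.0%})")
```

Under this assumption most of the XRCC4 is free of ligase IV. Setting `xrcc4_per_lig4=1` changes the numbers but not the conclusion that ligase IV is limiting.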

Physiological roles and clinical potential of PNKP


PNKP is involved in several DNA repair pathways that protect cells from endogenous and exogenous genotoxic agents. Disruption of NHEJ genes and SSBR/BER genes gives rise to neurological disorders with various symptoms. One example is microcephaly, which occurs in people with mutations in LIG4, the gene that encodes DNA ligase IV. Deletion of Xrcc1 in mice causes seizures. PNKP mutations have been found to cause a severe autosomal recessive neurological disease (MCSZ) that is characterized by microcephaly, intractable seizures and developmental delay. Analysis of affected families found mutations in both the kinase and phosphatase domains. The combination of symptoms shown by patients with MCSZ reflects the involvement of PNKP in multiple DNA repair pathways.

PNKP has also been linked to pathophysiological conditions. Elevated expression of PNKP has been observed in arthrofibrotic tissue, which suggests a role for PNKP in mitigating the effects of ROS generated by macrophages. It has also been observed that physiologically and environmentally relevant doses of cadmium and copper, which are known to elicit neurotoxic and carcinogenic effects, inhibit PNKP.

The DNA repair capacity of tumor cells is an important factor in the clinical response to many antineoplastic agents. Inhibitors of several DNA repair enzymes, including PNKP, are therefore under investigation for their potential to sensitize cells to radiation and chemotherapeutic drugs. Through this research, a small molecule inhibitor of PNKP phosphatase activity was identified and shown to heighten the sensitivity of cells to IR and camptothecin. Camptothecin is the parent compound of two clinically important topoisomerase I poisons, irinotecan and topotecan, which are frequently used to treat colon and ovarian cancers.



PNKP is an important enzyme in the cellular processing of strand break termini, and it is involved in many DNA repair pathways. More research is needed to identify how it is regulated, how it collaborates with other repair enzymes, and what its physiological role is in neurons and other tissues. Because it is involved in a variety of repair pathways, PNKP is seen as a therapeutic target in the treatment of cancer. New inhibitory compounds will therefore need to be identified, researched, and optimized for clinical use. Further research should also aim to identify synthetic lethal partners of PNKP, so that PNKP inhibitors could be used as single agents against tumors deficient in those partner proteins.



Weinfeld, Michael. "Tidying up loose ends: the role of polynucleotide kinase/phosphatase in DNA strand break repair." Trends in Biochemical Sciences 36.5 (2011): 262-71. PubMed. Web. 21 Nov. 2012.

DNA Packaging


DNA packaging is an important process in living cells. Without it, a cell could not accommodate the large amount of DNA stored inside it. For example, a bacterial cell that is 1 to 2 μm long contains DNA that is about 400 times as long (Becker et al. 530). Eukaryotic cells face an even bigger challenge. A typical human cell has enough “DNA to wrap around the cell more than 15,000 times” (531). DNA packaging is therefore crucial, because it allows all of this DNA to fit into a cell that is many times smaller.

The DNA in bacterial cells is either circular or linear. To fit within the bacterial cell, supercoiled DNA is folded into loops, and each loop has the shape of bead-like packets containing small basic proteins that are analogous to the histones found in eukaryotes (533).

In eukaryotic cells, DNA packaging is more complicated because these cells contain much more DNA than bacterial cells. More proteins are therefore required for the process, the most important being the histones. Histones consist largely of positively charged amino acids such as lysine and arginine, which make the overall structure positive. Histones therefore interact favorably with the negative phosphate groups of DNA. There are five main types of histone: H1, H2A, H2B, H3 and H4 (533). Two each of H2A, H2B, H3 and H4 join to form an octamer, around which 146 base pairs of DNA are wrapped like a bead on a string. This bead, consisting of eight histone molecules and 146 DNA base pairs, is known as the nucleosome. Nucleosomes are connected by DNA linkers of 50 base pairs to form a fiber-like structure called chromatin. H1 is believed to be found on these DNA linkers. Chromatin fibers can be further compacted to form higher-order structures called heterochromatin or euchromatin, depending on the degree of packing. Ultimately, DNA packaging in eukaryotic cells can lead to the formation of the chromosome, which is only present during cell division or in several other situations (533-535). In eukaryotic cells, DNA is packaged not only in the nucleus but also in mitochondria and chloroplasts. The overall shape of their DNA resembles that of bacteria rather than that of the eukaryotic nucleus.
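The numbers in this paragraph lend themselves to a quick back-of-the-envelope calculation: with 146 base pairs wrapped around each octamer and a 50 base pair linker, one nucleosome repeat covers 196 base pairs. The Python sketch below uses those two figures from the text. The function names and the example DNA length are hypothetical, and the uniform repeat length is a simplification.

```python
CORE_BP = 146    # DNA wrapped around one histone octamer (from the text)
LINKER_BP = 50   # linker DNA between nucleosomes (from the text)
HISTONES_PER_NUCLEOSOME = 8  # two each of H2A, H2B, H3 and H4


def nucleosome_count(dna_bp):
    """Number of complete nucleosome repeats that fit on dna_bp base pairs."""
    return dna_bp // (CORE_BP + LINKER_BP)


def wrapped_fraction():
    """Fraction of each repeat that is wrapped around the octamer."""
    return CORE_BP / (CORE_BP + LINKER_BP)


# Hypothetical stretch of DNA, chosen so the arithmetic is easy to follow.
dna_bp = 196_000
n = nucleosome_count(dna_bp)
print(f"{dna_bp} bp -> {n} nucleosomes, {n * HISTONES_PER_NUCLEOSOME} core histones")
print(f"fraction of DNA wrapped on octamers: {wrapped_fraction():.2f}")
```

With this repeat length, roughly three quarters of the DNA is wrapped around histone octamers at any time, with the remainder in linkers.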

Histone chaperones and the nucleosome assembly processes


Histones are proteins that allow DNA to be tightly packaged into units called nucleosomes. The DNA wraps itself around the histones.

Chromatin is made of DNA and proteins (histones). Chromatin gives structure to a chromosome.

A nucleosome consists of the acidic DNA and the basic histone proteins.

Histone chaperones: Histone chaperones guide the folding pathways and assist in the wrapping and unwrapping of the DNA around the histones.

Organization The tight coiling of DNA allows easier access to the DNA which makes sequencing faster.

Need for histone chaperones Nucleosomes can be assembled or disassembled and are done in stepwise function. Histone chaperones guide the pathway process, they control and regulate.

Structural forms of histone chaperones Since histone chaperones participate at each step of the nucleosome assembly processes, there are different chaperones needed for each different step.[1]



Becker, Wayne M, et al. The World of the Cell. 7th ed. New York: Pearson/Benjamin Cummings, 2009. Print.

Churchill, Das, Tyler. The histone shuffle: histone chaperones in an energetic dance.

Chromatin and aging



Sir2 of Saccharomyces cerevisiae is an NAD+-dependent histone deacetylase. Its role within the cell is to link chromatin silencing to genomic stability, cellular metabolism, and lifespan regulation. For example, mice deficient in SIRT6 (a mammalian member of the Sir2 family) experience genomic instability, metabolic defects, and aging-related degenerative pathologies, the opposite of what Sir2 promotes. With new insights into the previously ambiguous SIRT6, scientists have discovered that it is a highly substrate-specific histone deacetylase that promotes proper chromatin function in processes such as telomere stabilization and DNA repair.

Sir2: a chromatin-aging connection


Sir2 is the founding member of the family of proteins called sirtuins. These proteins provided the first link between chromatin regulation and aging. Sir2 favors chromatin silencing at sub-telomeric DNA, silent mating-type loci, and rDNA repeats. These effects of Sir2 on chromatin are mediated by its NAD+-dependent histone deacetylase activity: Sir2 catalyzes the deacetylation of lysine residues on the amino-terminal tails of histones H3 and H4 and also on the globular histone core. Deacetylation of H4 lysine 16 (H4K16) and H3 lysine 56 (H3K56) mediates the silencing effects of Sir2.
Example: In budding yeast, Sir2 regulates replicative lifespan through a couple of chromatin-silencing processes.

First, Sir2 suppresses recombination between rDNA repeats and this prevents
Second, Sir2 protein levels decrease as replicative age increases, and H4K16 acetylation levels at telomeres rise as a result. These chromatin changes create defects in telomere position-dependent transcriptional silencing and trigger replicative senescence.[2]

A few studies have shown that some aging-related Sir2 functions might be chromatin-independent, making the relationship between Sir2 and lifespan regulation even more complex.
For example, Sir2 asymmetrically segregates damaged proteins to the yeast mother cell during cell division; this asymmetry can age the mother cell by forming toxic protein aggregates. Also, Sir2 can block lifespan extension in response to nutrient deprivation or mutations in nutrient-sensing pathways.

Mammalian sirtuin proteins: venturing out from chromatin


SIRT1 is the most closely related to yeast Sir2 of the seven mammalian Sir2 family members. However, Sir2 appears to deacetylate histones exclusively, while SIRT1 appears to have more than 40 substrates. SIRT1 deacetylates many non-histone proteins and affects many physiologic processes, such as apoptosis.
SIRT1, SIRT6, and SIRT7 are concentrated in different sub-nuclear patterns; SIRT2 is cytoplasmic; SIRT3, SIRT4, and SIRT5 reside in the mitochondria.

SIRTching for a function through knockout mice


SIRT6-deficient mice appear normal at birth, but after a couple of weeks they start to develop degenerative phenotypes such as osteoporosis. They also experience metabolic defects, including levels of insulin-like growth factor 1 (IGF-1) so low that the mice die by 1 month of age.

An orphan enzyme finds its substrates


Experiments in vitro showed that SIRT6 promotes mono-ADP-ribosylation, an alternative NAD+-dependent reaction in sirtuins. A further breakthrough in understanding SIRT6 function came with the discovery of its enzymatic activity and first substrate: NAD+-dependent deacetylation of histone H3 lysine 9 (H3K9). SIRT6 specifically deacetylates H3K9 but, because of its strict specificity, lacks activity on many other histone tail residues.
Two groups independently identified a second substrate for SIRT6: acetylated lysine 56 of histone H3 (H3K56Ac).

To the core and beyond: biochemical dissection of SIRT6 function


Sirtuin proteins have a conserved central "sirtuin domain" flanked by N- and C-terminal extensions. The sirtuin domain contains the enzymatic core, and understanding the regions around it can show scientists how sirtuin proteins are regulated physiologically.
For SIRT6, a recent study showed that the N- and C-terminal extensions regulate its function: the C terminus is required for proper nuclear localization (but is dispensable for enzymatic activity), while the N terminus is important for chromatin association and intrinsic catalytic activity.
Why is catalytic activity required for chromatin association in the cell?

It is possible that histone deacetylation by SIRT6 stabilizes the protein at chromatin, or that it promotes the propagation of SIRT6 molecules along chromatin.

At the ends of chromosomes: SIRT6 regulates telomeric chromatin


SIRT6 plays an important chromatin-regulatory role by maintaining the integrity of telomeric chromatin. Telomeres are specialized DNA-protein structures that protect the ends of linear chromosomes from degradation and fusion. The role of SIRT6 at human telomeres matters for a couple of reasons:

First, telomere structures need to be correct in order to maintain genomic stability; chromosomal instability is apparent in cancer cells.
Also, telomere length decreases with cellular age, which suggests that the role of SIRT6 at telomeres is connected to aging.



Through many experiments and discoveries, SIRT6 has been established as a site-specific histone deacetylase that plays important roles in maintaining telomere integrity, fine-tuning aging-associated gene expression programs, preventing genomic instability, and maintaining metabolic homeostasis.
SIRT6 not only functions at specific sites in the genome but may also bind additional gene promoters. There might also be interactions between SIRT6 and other sirtuin proteins.
Lastly, SIRT6 might have an impact on cancer, since links have been reported between cancer and the SIRT6 chromosomal locus.



Tennen, Ruth I., and Katrin F. Chua. "Chromatin regulation and genome maintenance by mammalian SIRT6." Trends in Biochemical Sciences 36.1 (2011) 39-46. Academic Search Complete. Web. 05 December. 2012.

RNA is also known as ribonucleic acid. It is a part of most living organisms as well as viruses. It contains the bases adenine, cytosine, guanine, and uracil (instead of thymine), each of which is bound to a ribose. RNA can be used to make DNA as well as to synthesize proteins. It is the only polymer that can both serve as a catalyst in the formation of proteins and store genetic information. The RNA backbone is made of alternating ribose and phosphate groups. RNA is usually single stranded in humans, but can be double stranded in other contexts, including some viruses.

Some viruses have RNA as their primary genetic material. They are known as RNA viruses. These viruses infect cells by first binding to a specific protein or receptor on the surface of the cell. After binding to the cell's surface, the virus delivers its genetic material, the RNA, into the cell. The viral RNA then associates with the ribosomes of the infected cell. Essentially, the virus seizes control of its host's molecular machinery and uses the host cell's translational abilities to produce viral proteins. The newly made viral proteins then go on to form new viruses. Furthermore, viral RNA can form replication complexes in which it copies itself. This newly replicated RNA is packaged into the newly created viruses, whose release can cause the cell to lyse, or break open. The released viruses can then go on to infect other cells.

RNA is a nucleic acid. Its typically single-stranded, helical structure is built from nucleotides, each consisting of a nitrogenous base, a ribose sugar, and a phosphate group. The bases are adenine, guanine, cytosine, and uracil; the N1 nitrogen of a pyrimidine base or the N9 nitrogen of a purine base is bonded to the 1' carbon of the pentose sugar by a glycosidic bond. Adenine pairs with uracil and cytosine pairs with guanine through hydrogen bonds. Ribose is a pentose sugar whose carbons are numbered 1' to 5', and it carries a hydroxyl group on the 2' carbon. The 3' and 5' carbons of adjacent ribose sugars are linked through phosphate groups by phosphodiester bonds. More importantly, helical RNA adopts A-form geometry, which has a deep, narrow major groove and a shallow, wide minor groove. The strand can fold on itself to form secondary structure, as in tRNA and rRNA, and secondary structures stabilized by hydrogen bonds, loop domains, and metal ions such as Mg2+ assemble into a specific tertiary form.

Double Stranded RNA


Double-stranded RNAs, or dsRNAs, are RNAs that have a complementary strand, similar to DNA. Many viruses have dsRNA genomes and infect a variety of hosts, including animals, humans, fungi, plants, and bacteria. An RNA virus is a virus that contains only RNA as its genetic material, or whose genetic material passes through an RNA intermediate during replication. Under this broader definition, Hepatitis B is an example: even though it has a double-stranded DNA genome, the genome is transcribed into RNA during replication. RNA viruses have very high mutation rates because the polymerases that copy their genomes lack the proofreading activity with which DNA polymerases find and correct mistakes. dsRNAs can also be produced synthetically by in vitro transcription, using cloning and PCR to amplify the template. dsRNAs trigger the RNAi pathway.

Double-stranded RNA is important because it helps regulate gene expression in eukaryotic cells. It triggers a form of gene silencing known as RNA interference (RNAi). In RNAi, a dsRNA is chopped into smaller fragments that bind to mRNA and block its expression. This reduces the production of the gene's encoded protein, which helps the cell keep growth and defense responses at the right level.



RNA is usually found in humans as a single-stranded linear polymer. The monomeric units (nucleotides) are linked together by 3'-5' phosphodiester bridges. (A nucleoside is a ribose sugar connected to a base through the 1' carbon, while a nucleotide is a nucleoside plus a phosphate group connected to the 5' carbon of the sugar.) The secondary structure of RNA is stabilized by hydrogen bonds formed through intrastrand pairing of the bases (AU, GC), oftentimes resulting in structures such as hairpin loops. The stability of these loops depends on the number of unpaired bases in the loop; anything more than 10 or fewer than 5 is not very energetically favorable. Often the structure of RNA is not very stable because Watson-Crick base pairs cannot be matched up in the stem of the hairpin loop. Because it is single stranded, RNA will also fold into more complex structures, and at times three nucleotides interact together to stabilize the structure. Mg2+ stabilizes the more elaborate structures. In these cases, hydrogen bond donors or acceptors that are not already engaged in Watson-Crick base pairs can interact and hydrogen bond in 'irregular' pairings. Because of the extra hydroxyl group attached to the 2' carbon, RNA is not as stable as DNA and does not form double helices as easily, although double-stranded RNA has been found in some viruses. The 2' hydroxyl group on RNA also allows it to self-hydrolyze: the hydroxyl group attacks the adjacent phosphorus, which cleaves the phosphodiester bond to the 5' carbon of the next nucleotide. This instability also contributes to DNA being the preferred molecule for genetic storage in humans.
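The hairpin rules described above (AU and GC pairing in the stem, and a loop of roughly 5 to 10 unpaired bases) can be checked mechanically. The following is a minimal sketch under simplifying assumptions: the stem length is supplied by the caller, only Watson-Crick pairs are counted, and no energy model is used. The function name `hairpin_report` and the example sequence are hypothetical.

```python
# Watson-Crick pairs in RNA, as named in the text (AU, GC).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def hairpin_report(seq: str, stem_len: int) -> dict:
    """Check a candidate hairpin: stem pairing and loop size.

    seq is written 5'->3'. The first stem_len bases are tested against
    the last stem_len bases; whatever lies between them is the loop.
    """
    seq = seq.upper()
    loop_len = len(seq) - 2 * stem_len
    if stem_len < 1 or loop_len < 0:
        raise ValueError("stem_len does not fit this sequence")

    five_prime_arm = seq[:stem_len]
    # Reverse the 3' arm so that position i faces position i of the 5' arm.
    three_prime_arm = seq[len(seq) - stem_len:][::-1]
    paired = sum((a, b) in PAIRS for a, b in zip(five_prime_arm, three_prime_arm))

    return {
        "loop_len": loop_len,
        "paired": paired,
        "stem_fully_paired": paired == stem_len,
        # Range quoted in the text: fewer than 5 or more than 10 is unfavorable.
        "loop_in_favored_range": 5 <= loop_len <= 10,
    }


if __name__ == "__main__":
    # Made-up sequence: 5-base stem, 7-base loop, complementary 5-base stem.
    print(hairpin_report("GGCACUUCGAAAGUGCC", stem_len=5))
```

A real structure prediction would also weigh non-Watson-Crick pairs and metal-ion effects, which this check ignores.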

The technique of Northern blotting is often used to detect specific RNA sequences in a sample.



There are many different types of RNA, and they carry out different functions in the cell.


Messenger RNA

Transcribed from DNA and serves as the template for the synthesis of protein. RNA polymerase copies a DNA template to make mRNA.

Transfer RNA

Brings activated amino acids from other parts of the cell to the site of translation, the ribosome. tRNA reads the information in the mRNA and matches it to amino acids. In other words, it translates information from RNA into protein.

Ribosomal RNA

A constituent of ribosomes that takes part in translating messenger RNA into protein. rRNA is the most abundant type of RNA and is responsible for the activity of the ribosome, including the formation of peptide bonds.

Small interfering RNA

Binds to messenger RNA and promotes its degradation.

Micro RNA

Small non-coding RNA that inhibits translation of its complementary mRNA.

Small nuclear RNA

Responsible for the processing of hnRNA by removal of the introns (splicing), as well as for maintaining telomeres.

Interference RNA

Inhibits gene expression by cutting up mRNA.
Structural insights into RNA interference.

The structures of these different types of RNA vary depending on what they are supposed to do, and the tertiary structure varies by function. Even in the simplest sense, some are relatively long strands of nucleic acid, such as messenger RNAs of about 1.2 kilobases, while others are relatively short sequences of about 21 nucleotides, such as miRNA.


Viadiu, Hector. "Types of RNA." UCSD. Lecture. November 2012.

Messenger ribonucleic acid (mRNA) is the blueprint for protein production. Transcribed from deoxyribonucleic acid (DNA), mRNA carries genetic information from the cell nucleus to the protein-producing ribosomes located in the cytoplasm. As in DNA, the genetic information is encoded in four nucleotides that are arranged in codons, or triplets of nucleotide bases. Each codon corresponds to a specific amino acid, and the sequence of codons ends with a codon that acts as a stop signal. The protein synthesis process also requires transfer RNA (tRNA) and ribosomal RNA (rRNA). mRNA makes up only about 5% of the RNA found in both prokaryotic and eukaryotic cells.




During transcription, a DNA strand is copied into RNA by an enzyme, RNA polymerase. RNA is synthesized in the 5' to 3' direction, as is also the case in DNA replication. Of the two DNA strands, the template strand is the one from which the RNA is synthesized. RNA polymerase reads the template starting from its 3' end and joins the incoming nucleotides via phosphodiester bonds.

The obvious difference between DNA and mRNA in this stage is in the uracil (U) that is present in RNA instead of thymine (T) in DNA.
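The template rule and the U-for-T substitution can be illustrated in a few lines. This is a minimal sketch, assuming the template strand is written 3' to 5' so that the resulting RNA reads 5' to 3'. The function name `transcribe` and the nine-base example are hypothetical.

```python
# Pairing used during transcription: each template DNA base specifies
# its complement, with uracil (U) taking the place of thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


if __name__ == "__main__":
    template = "TACGGCATT"           # template strand, 3'->5' (made-up example)
    coding = "ATGCCGTAA"             # the opposite (coding) strand, 5'->3'
    mrna = transcribe(template)
    print(mrna)
    # The mRNA matches the coding strand except that U replaces T.
    print(mrna == coding.replace("T", "U"))
```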

The RNA first transcribed from the DNA is known as pre-messenger RNA (pre-mRNA), since this exact copy of the DNA region contains both introns and exons. Messenger RNA contains only exons. Introns are removed via splicing by spliceosomes, which recognize intronic sequences based on a GU beginning, a long pyrimidine tract, and an AG ending. Only exons remain in mRNA because they contain the genetic information used in translation to produce a protein. Introns, by contrast, do not code for the protein.

Caps and poly(A) tails are added as modifications to protect the ends of the mRNA after transcription and before translation.



In eukaryotes, the product of transcription of a protein-coding gene is pre-mRNA which requires processing to generate functional mRNA. Several processing reactions occur.

5' processing: capping

Very soon after it has been synthesized by RNA polymerase II, the 5' end of the primary RNA transcript, pre-mRNA, is modified by the addition of a 5' cap (a process known as capping). This process involves the addition of 7-methylguanosine (m7G) to the 5' end. To achieve this, the terminal 5' phosphate is first removed by a phosphatase. Guanylyl transferase then catalyzes a reaction whereby the resulting diphosphate 5' end attacks the α phosphorus atom of a GTP molecule to add a G residue in an unusual 5'-5' triphosphate link. The G residue is then methylated by a methyl transferase, which adds a methyl group to the N-7 position of the guanine ring, using S-adenosyl methionine as methyl donor. The ribose of the adjacent nucleotide (nucleotide 2 in the RNA chain) or the riboses of both nucleotides 2 and 3 may also be methylated to give cap 1 or cap 2 structures respectively. In these cases, the methyl groups are added to the 2'-OH groups of the ribose sugars.

The cap protects the 5' end of the primary transcript against attack by ribonucleases that have specificity for 3'5' phosphodiester bonds and so cannot hydrolyze the 5'5' bond in the cap structure. In addition, the cap plays a role in the initiation step of protein synthesis in eukaryotes. Only RNA transcripts from eukaryotic protein-coding genes become capped; prokaryotic mRNA and eukaryotic rRNA and tRNAs are uncapped.


RNA splicing is a key step in RNA processing because it precisely removes the intron sequences and joins the ends of neighboring exons to produce a functional mRNA molecule. The exon-intron boundaries are marked by specific sequences. In most cases, at the 5' boundary between the exon and the intron (the 5' splice site), the intron starts with the sequence GU, and at the 3' exon-intron boundary (the 3' splice site) the intron ends with the sequence AG. Each of these two sequences lies within a longer consensus sequence. A polypyrimidine tract (a conserved stretch of about 11 pyrimidines) lies upstream of the AG at the 3' splice site. A key signal sequence is the branchpoint sequence, which is located about 20-50 nt upstream of the 3' splice site. In vertebrates this sequence is 5'-CURAY-3', where R=purine and Y=pyrimidine (in yeast this sequence is 5'-UACUAAC-3'). RNA splicing occurs in two steps. In the first step, the 2'-OH of the A residue at the branch site attacks the 3'-5' phosphodiester bond at the 5' splice site, causing that bond to break and the 5' end of the intron to loop round and form an unusual 2'-5' bond with the A residue in the branchpoint sequence. Because this A residue already has 3'-5' bonds with its neighbors in the RNA chain, the intron becomes branched at this point to form what is known as a lariat intermediate (so named because it resembles a cowboy's lasso). In the second step, the new 3'-OH end of exon 1 attacks the phosphodiester bond at the 3' splice site, causing the two exons to join and release the intron, still as a lariat. In each of the two splicing reactions, one phosphate-ester bond is exchanged for another (i.e. these are two transesterification reactions). Since the number of phosphate-ester bonds is unchanged, no ATP is consumed.
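The net result of splicing described above (introns that begin with GU and end with AG are removed and the flanking exons are joined) can be sketched as a string operation. This is only an illustration of the outcome, not of the two-step lariat chemistry, and it assumes the intron coordinates are already known. The function name `splice` and the example sequences are hypothetical.

```python
def splice(pre_mrna: str, introns: list) -> str:
    """Remove introns from a pre-mRNA and join the exons.

    introns is a list of (start, end) index pairs, end exclusive.
    Each intron must start with GU and end with AG, as in the text.
    """
    pre_mrna = pre_mrna.upper()
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[cursor:start])  # exon lying before this intron
        cursor = end                          # continue after the intron
    exons.append(pre_mrna[cursor:])           # final exon
    return "".join(exons)


if __name__ == "__main__":
    # Made-up sequences: two short exons around one GU...AG intron.
    exon1, intron, exon2 = "AUGGCC", "GUAAGUCUAACUUUCUUUUCAG", "UUUUAA"
    pre = exon1 + intron + exon2
    print(splice(pre, [(len(exon1), len(exon1) + len(intron))]))
```

The check on the GU and AG boundaries is deliberately strict. The text notes that these boundaries hold in most cases, so a fuller model would allow exceptions.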

3' processing: cleavage and polyadenylation

A majority of eukaryotic pre-mRNAs undergo polyadenylation, which involves cleavage of the RNA at its 3' end and the addition of about 200 A residues to form a poly(A) tail. The cleavage and polyadenylation reactions require a polyadenylation signal sequence (5'-AAUAAA-3') located near the 3' end of the pre-mRNA, followed by a sequence 5'-YA-3' (where Y=a pyrimidine), often 5'-CA-3', in the next 11-20 nt. A GU-rich sequence (or U-rich sequence) is also usually present further downstream. After these sequence elements have been synthesized, two multisubunit proteins called CPSF (cleavage and polyadenylation specificity factor) and CStF (cleavage stimulation factor F) are transferred from the CTD of RNA polymerase II to the RNA molecule and bind to the sequence elements. A protein complex is formed which includes additional cleavage factors and an enzyme called poly(A) polymerase (PAP). This complex cleaves the RNA between the AAUAAA sequence and the GU-rich sequence. Poly(A) polymerase then adds about 200 A residues to the new 3' end of the RNA molecule, using ATP as precursor. Once made, the poly(A) tail protects the 3' end of the final mRNA against ribonuclease digestion and hence stabilizes the mRNA. In addition, it increases the efficiency of translation of the mRNA. However, some mRNAs, notably histone mRNAs, lack a poly(A) tail. Nevertheless, histone pre-mRNA is still subject to 3' processing. It is cleaved near the 3' end by a protein complex that recognizes specific signals, one of which is a stem-loop structure, to generate the 3' end of the mature mRNA molecule.
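The sequence logic of cleavage and polyadenylation can be sketched as a search followed by a string edit. The sketch below rests on stated assumptions: the YA dinucleotide is taken to begin 11 to 20 nt after the end of AAUAAA, cleavage is placed immediately after that YA, and the tail is exactly 200 A residues. The protein factors (CPSF, CStF, PAP) are not modeled. The function name `polyadenylate` and the example transcript are hypothetical.

```python
import re

SIGNAL = "AAUAAA"                   # polyadenylation signal from the text
YA_SITE = re.compile("[CU]A")       # Y = pyrimidine (C or U), followed by A


def polyadenylate(pre_mrna: str, tail_len: int = 200):
    """Cleave after the YA site downstream of AAUAAA and add a poly(A) tail.

    Returns None if the signal or a suitably placed YA site is missing.
    """
    rna = pre_mrna.upper()
    sig = rna.find(SIGNAL)
    if sig == -1:
        return None
    sig_end = sig + len(SIGNAL)

    # Assumption: the YA dinucleotide starts 11-20 nt past the signal.
    # endpos is start + 2 so that a YA starting at the last allowed
    # position still fits inside the searched region.
    site = YA_SITE.search(rna, sig_end + 11, sig_end + 20 + 2)
    if site is None:
        return None

    cleaved = rna[:site.end()]      # everything downstream is discarded
    return cleaved + "A" * tail_len


if __name__ == "__main__":
    # Made-up transcript: signal, 12-nt spacer, CA site, GU-rich downstream.
    transcript = "AUGGCCUAAGGG" + SIGNAL + "G" * 12 + "CA" + "GUGUGUUGUU"
    product = polyadenylate(transcript)
    print(len(product), product[:40])
```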

The primary RNA transcript that continues to be synthesized includes both coding(exon) and noncoding(intron) regions. The latter need to be removed and the exon



In Eukaryotic cells, following synthesis, mRNA typically goes through a series of modifications before being exported to the cytoplasm for translation. These modifications include a 5’ guanine capping and a polyadenylation at the 3' end. This strand of Adenine residues (anywhere from 80-250) is called the Poly-A tail and is needed for the export, protection, translation, and stability of the mRNA. Splicing, the process in which introns are removed and exons are joined, also occurs before exportation.

After all the proper modifications have been carried out, the mature mRNAs are ready to be exported through the nuclear pores into the cytoplasm. Nuclear pores are the channels between the nucleus and cytoplasm, and they act as a selective barrier that allows the transport of macromolecules. Alternative splicing patterns allow the same gene to be expressed in slightly different mRNA forms, creating different but similar proteins. For mature mRNAs to be exported, a messenger ribonucleoprotein (mRNP) export complex must first form with RNA-binding proteins and transport factors (carriers), since the Mex67-Mtr2 heterodimer, the principal mRNA carrier, binds only loosely to bulk mRNA.

Nuclear Transport

Summary of mRNA nuclear export

Nuclear export is a pathway unique to eukaryotic cells because the nuclear and cytoplasmic compartments within eukaryotic cells enable the spatial separation of two processes, transcription and translation. The separation between the two processes allows for multiple steps in between for further modification and regulation of gene expression, which become vital for physiological responses to extra- and intracellular signals.

mRNA nuclear export can be simplified to three stages:

  • 1) the pre-mRNA is transcribed in the nucleus, the site of mRNA synthesis, processing, and packing into mRNP (messenger ribonucleoprotein) complexes (as briefly described earlier)
  • 2) the mRNP molecules are targeted to and translocated through the nuclear pore complexes (NPC) of the nuclear envelope
  • 3) the mRNPs are released into the cytoplasm for translation to occur.

Each of these stages involves numerous protein factors and other molecules that need to be recruited to carry out the process.

Formation of mRNP in yeasts:

  • 1) In the nucleus, transcription is carried out by RNA polymerase II. This is followed by modifications such as the addition of the 5' cap, splicing, and 3' processing. The TREX complex is recruited during these processes and coordinates many of the next steps.
  • 2) The 3' end processing is necessary because it generates the poly(A) tail, which is crucial for the mRNA to be exported. This process requires the factors Rna14, Rna15 and Pcf11. Nab2 is added onto the poly(A) tail of the mRNA, and Yra1 and Sub2 are recruited during this time. When the mRNA is in contact with Pcf11, Yra1 is transferred to the TREX subunit Sub2. (Yra1-Pcf11 binding is an important early step.) Yra1 is necessary
  • 3) The Mex67-Mtr2 heterodimer is recruited.
  • 4) mRNPs can now be remodeled by the DEAD-box helicase Sub2.
  • 5) Yra1 dissociates from the mRNP before export, along with the TREX complex.
  • 6) The mRNP is drawn to the nuclear side of the NPC transport channel, where weak interactions arise with FG nucleoporins (proteins that line the nuclear pore). To increase the efficiency of export, several mechanisms exist to concentrate mRNAs at the nuclear side of the NPC. For example, several actively transcribed genes such as GAL1 are concentrated at the NPC.
  • 7) The mRNP passes through the NPC transport channel to the cytoplasmic side, where it is once again remodeled to prevent it from going back into the nucleus.

Cofactors Involved

NPC and its facilitating proteins

The NPCs themselves contain essential proteins that facilitate mRNA nuclear export. Within the NPC, there is a conical, basket-like feature that protrudes into the nucleus, called the nuclear basket. It contains proteins such as Nup60, Nup2, and Mlp2. The cytoplasmic side similarly has proteins that are cofactors in the export process (Nup159, Dbp5, Gle1). There are several other key proteins and components of mRNA export that will not be discussed here, but the references for this page provide much more insight into the specific functions of these export factors.

Here is a short summary of the principal export factors for yeast and metazoans:

  • Mex67-Mtr2 (yeast) and Nxf1-Nxt1 (metazoan): facilitate bulk mRNA export through NPCs
  • Yra1 (yeast) and ALY (metazoan): adaptor linking Mex67-Mtr2 to the mRNA molecule
  • Sub2 (yeast) and UAP56 (metazoan): DEAD-box helicase involved in assembly of export-competent mRNPs
  • Nab2 (yeast): binds the poly(A) tail of mRNA, links it to Mlp1, and regulates the length of the 3' poly(A) tail
  • Mlp1 (yeast) and TPR (metazoan): nuclear basket protein that binds to Nab2
  • TREX (both yeast and metazoan): the complex involved in coordinating and regulating transcription
  • TREX-2 (both): directs actively expressed genes to NPCs
  • Gle1 and Gfd1 (yeast) and GLE (metazoan): enhance Dbp5 activity
  • Nup159 (yeast) and NUP214 (metazoan): cytoplasmic NPC protein that binds to Dbp5



Recruiting these factors is an essential step for the trafficking and quality control of export. Most molecules that are transported from the nucleus into the cytoplasm, including tRNAs and other small RNAs, use karyopherin-family receptors, whose transport direction is set by the gradient of the GTP-bound state of the small GTPase Ran. Bulk mRNA export is uncharacteristic in this respect: bulk mRNA is exported using Mex67-Mtr2, a non-karyopherin receptor, via the Nxf1 pathway. The Mex67-Mtr2 heterodimer is recruited to the mRNP through components of the TREX complex. Furthermore, recent work in vertebrates shows that the binding of the Yra1 homologue, ALY, to mRNA is stimulated by the ATP-bound form of the Sub2 homologue UAP56. This binding increases the ATPase activity of UAP56. Moreover, Nxf1 binds mRNA-associated ALY, forming a ternary complex, and the RNA-binding affinity of Nxf1 is increased in the presence of ALY. Taken together, these events result in an mRNP with bound export receptors. It is unclear, however, how many receptors must bind a single mRNA for efficient export to occur.

Bulk mRNA Export Pathway

NPC protein interactions with export factors of mRNP complex

Alongside the Nxf1 pathway, a small set of transcripts is exported via the karyopherin Crm1, a protein that also mediates the export of incompletely spliced mRNA from HIV. Therefore, if an mRNA molecule is not properly processed and spliced of its introns, it can be kept in the nucleus and degraded, since it is treated like a viral mRNA molecule. When the mRNP has been properly processed and has recruited all the necessary receptors and cofactors, it is considered export ready (export competent). The export-competent mRNP is then targeted to the NPC by its recruited export receptor. The export receptor carries the mRNP to the NPC, where it docks and interacts with the NPC proteins to allow recognition. These interactions are summarized in the accompanying figure.

Bulk Release into Cytoplasm


The directionality of bulk mRNA release is determined by another mechanism, since it does not depend on the RanGTP gradient used for small RNA export. It is determined by the function of two important export factors, Dbp5 and Gle1. The Dbp5 protein binds to the cytoplasmic face of the NPC by interacting with the NPC protein Nup214. As the mRNP comes closer to the cytoplasmic side of the NPC, it interacts with Dbp5 and Gle1. The interactions between the mRNP and the two proteins cause a conformational change and trigger the removal of a set of proteins from the mRNP. This physically and spatially changes the mRNP, making it suitable for release from the NPC into the cytoplasm. The removed proteins are recycled and brought back into the nucleus, where they go through another cycle of mRNA export. In addition, as the mRNP enters the cytoplasm, specific cytoplasmic mRNA-binding proteins are incorporated. These specific links to translation further show the inherent connections between steps in gene expression.

Translational Significance


Since mRNA export is essential for proper gene expression, this process must be carried out correctly. Incorrect steps in export can lead to errors in gene expression, and consequently in translation. For example, errors in recruiting export factors can lead to faulty mRNPs, and if the transcript fails nuclear surveillance the mRNA may be kept inside the nucleus and degraded by the exosome and various other enzymes. Errors in mRNA export have also been linked to many human diseases and developmental problems. Defective mRNA export is connected to mutations in genes encoding export proteins or mRNA-binding proteins, as well as to mutations in genes that inhibit the correct export of their own mRNA transcripts. Extreme cases also include the downregulation or hijacking of endogenous mRNA export complexes by viruses, which enables specific viral genes to be expressed in the organism. As knowledge of the mRNA export process grows, these malfunctions can be better understood and more easily prevented, and it may become possible to address many diseases and to understand how cellular function is generated at the simplest level: molecularly.



In prokaryotes, because the mRNA does not need to be modified or transported, it can be translated by the ribosomes right after transcription.

A picture of the translation process.

In eukaryotes, however, mRNA can only be translated after it has been modified and transported to the cytoplasm (as mature mRNA). mRNA is translated into protein on ribosomes, which are found free in the cytoplasm or attached to the endoplasmic reticulum. Translation starts when the ribosome binds to a site near the 5' end. The ribosome moves along the mRNA until it comes across the start codon AUG. There, the ribosome is joined by an initiator tRNA that recognizes the start codon and carries a methionine (Met) group, or a formylmethionine (fMet) group in bacteria. Next, an aminoacyl-tRNA that can base pair with the next codon joins the ribosome complex. The aminoacyl-tRNA arrives with an elongation factor (EF-Tu in bacteria) and a source of energy (usually GTP). The fMet (in bacteria) or Met group covalently bonds to the incoming amino acid of the aminoacyl-tRNA. The initiator tRNA is then released and the ribosome shifts one codon toward the 3' end. A new aminoacyl-tRNA arrives, and its amino acid is joined to the previous amino acid. This process continues until the ribosome reaches a stop codon (UAA, UAG, or UGA). The chain of linked amino acids is the protein translated from the mRNA. The ribosomal complex then splits back into its separate parts, which re-assemble when new mRNA needs to be translated into protein.

The elongation process terminates when a stop codon reaches the A site of the ribosome. No tRNA recognizes a stop codon, so no aminoacyl-tRNA is accepted at the A site. Instead, the A site is bound by a protein called a release factor. The release factor hydrolyzes the bond between the tRNA and the polypeptide in the P site, thus releasing the polypeptide chain. The two ribosomal subunits, the release factor, and the mRNA then come apart, which completes termination.

  • Stop Codon - A stop codon is a sequence of three nitrogenous bases in the mRNA that signals the termination of polypeptide elongation, and thus of translation. The polypeptide is then released from the ribosome and folds into its final 3D conformation.
Stop Codon Sequence
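
The scan-and-terminate logic described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `translate` and the example sequence are made up, the standard genetic code is assumed, and the ribosome is modeled as simply starting at the first AUG it finds.

```python
from itertools import product

# Standard genetic code, built from the usual U, C, A, G ordering.
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: elongation terminates here
        peptide.append(amino_acid)
    return "".join(peptide)

# Hypothetical mRNA: two leader bases, then AUG GCU UUC UAA
print(translate("GGAUGGCUUUCUAAGG"))
```

The codons after the stop codon are never read, which mirrors the release of the polypeptide at termination.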



In some instances the nucleotide sequence of an mRNA is changed after transcription. This process is called editing. In humans, the apolipoprotein B mRNA is one such case. This editing takes place in some tissues but not in all of them. The edit introduces a premature stop codon, so translation produces a shorter protein.
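
As a rough sketch of how a single edit shortens the protein, the Python below changes one C to U so that a CAA (glutamine) codon becomes a UAA stop codon. The sequence, the edited position, and the helper names are hypothetical and chosen only for illustration; the tiny codon table covers just the codons used here.

```python
# Minimal codon table for this example only; None marks a stop codon.
CODONS = {"AUG": "M", "GGU": "G", "CAA": "Q", "AAA": "K", "UAA": None}

def translate(mrna):
    """Read codons from position 0 until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid is None:
            break
        peptide.append(amino_acid)
    return "".join(peptide)

def edit_c_to_u(mrna, position):
    """Return a copy of the mRNA with the C at `position` changed to U."""
    if mrna[position] != "C":
        raise ValueError("expected a C at the edited position")
    return mrna[:position] + "U" + mrna[position + 1:]

unedited = "AUGGGUCAAAAAUAA"          # AUG GGU CAA AAA UAA
edited = edit_c_to_u(unedited, 6)     # CAA becomes UAA

print(translate(unedited), translate(edited))
```

The edited transcript yields a shorter peptide because the ribosome meets the new stop codon earlier.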

Altering the mRNA sequence through base modification (mRNA editing) frequently generates protein diversity. Several proteins have been identified as similar to C-to-U mRNA editing enzymes, based on their structural domains and the presence of a catalytic domain characteristic of cytidine deaminases. These proteins might represent novel mRNA editing systems that could affect proteome diversity, which makes their structure, expression, and relevance to biomedically significant processes or pathologies of interest.



After a certain amount of time, the message carried by an mRNA is degraded. This process is called degradation. Because mRNA has a limited lifetime, the cell can change its protein production quickly in response to changing needs. Different mRNAs have different lifetimes. The life span of mRNA molecules in the cytoplasm is an important factor in determining the pattern of protein synthesis within a cell. Prokaryotic mRNA molecules are often degraded by enzymes within a few minutes of their synthesis, which is one reason why prokaryotes can vary their patterns of protein synthesis so quickly in response to changes in their environment. Eukaryotic mRNA, on the other hand, typically survives for hours, days, or in some instances weeks. One example in multicellular organisms is the mRNAs for the hemoglobin polypeptides in developing red blood cells: these mRNAs are unusually stable, and they are translated repeatedly in the cell. Research on yeast suggests that a common pathway for mRNA degradation begins with the enzymatic shortening of the poly-A tail, which helps trigger the action of enzymes that remove the 5' cap. Removal of the 5' cap is a critical step and is itself regulated by particular nucleotide sequences in the mRNA. Once the cap is removed, nuclease enzymes rapidly degrade the mRNA. This pathway therefore relies on deadenylation: the shortening of the poly-A tail is initiated by a deadenylase, and afterward the mRNA is either fully degraded or, in certain cells, stored.
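
To see why a short lifetime lets a cell change its protein output quickly, one can compare how much of an mRNA pool survives over the same period. The sketch below assumes simple first-order decay, which is a modeling assumption and not something stated in the text, and the half-life values are hypothetical numbers picked only to contrast a short-lived with a long-lived mRNA.

```python
def fraction_remaining(elapsed, half_life):
    """Fraction of an mRNA pool left after `elapsed` time, assuming first-order decay."""
    return 0.5 ** (elapsed / half_life)

# Hypothetical half-lives in minutes, compared after 30 minutes.
short_lived = fraction_remaining(30, half_life=3)    # e.g. a bacterial mRNA
long_lived = fraction_remaining(30, half_life=600)   # e.g. a stable eukaryotic mRNA

print(f"short-lived: {short_lived:.4f}, long-lived: {long_lived:.4f}")
```

Under these assumptions almost none of the short-lived message is left after half an hour, while most of the long-lived message is still available for translation.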

Another mechanism that blocks the expression of specific mRNA molecules involves microRNAs (miRNAs), which have also become of interest. They are formed from longer RNA precursors that fold back on themselves, forming a long, double-stranded hairpin structure held together by hydrogen bonds. An enzyme called Dicer cuts the double-stranded RNA into short fragments. One of the two strands is degraded, and the other strand, the miRNA, associates with a large protein complex. The miRNA allows the complex to bind to any mRNA molecule with a complementary sequence, and the complex then either degrades the mRNA or blocks its translation.

Scientists also observed that RNA molecules can inhibit gene expression. They noticed that injecting double-stranded RNA molecules into a cell somehow turned off a gene with the same sequence. They called this phenomenon RNA interference (RNAi). It was later discovered that the interference is due to small interfering RNAs (siRNAs), which are similar in size and function to miRNAs. Research showed that the cellular machinery that makes siRNAs is the same as the machinery that makes miRNAs. The mechanisms by which these small RNAs function are also the same. Because the cellular RNAi pathway can lead to the destruction of RNAs whose sequences are complementary to the small RNAs, it is believed to have originated as a natural defense against infection by RNA viruses.
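
The idea that a small RNA finds its target by sequence complementarity can be sketched as a string search. The function names and sequences below are hypothetical, and the sketch requires a perfect antiparallel match, which is a simplification: real small RNAs can act on targets with imperfect pairing.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Sequence (5'->3') that pairs with `rna` in antiparallel orientation."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def find_target_sites(mrna, small_rna):
    """Start positions in `mrna` that are perfectly complementary to `small_rna`."""
    site = reverse_complement(small_rna)
    hits = []
    position = mrna.find(site)
    while position != -1:
        hits.append(position)
        position = mrna.find(site, position + 1)
    return hits

small_rna = "UAGCUUA"                    # hypothetical guide strand
mrna = "GGCUAAGCUACCUAAGCUAG"            # hypothetical mRNA with two target sites
print(find_target_sites(mrna, small_rna))
```

An mRNA with no complementary site returns an empty list, which corresponds to a transcript the small RNA does not silence.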


  • David Hames, Nigel Hooper. Biochemistry. Third edition. Taylor and Francis Group. New York,2005.
  • Neil A. Campbell, Jan B Reece. Biology Seventh Edition, 2005 Pearson Education, Inc.

Nuclear export of mRNA. Murray Stewart. MRC Laboratory of Molecular Biology, Hills Rd., Cambridge CB2 0QH, UK

Structural Biochemistry/Nonsense-Mediated mRNA decay



Transfer RNA (tRNA) has a primary, a secondary, and a tertiary (L-shaped) structure. tRNA binds activated amino acids and transfers them to the ribosome. At the ribosome, an initiator tRNA carrying the first amino acid binds to start translation. tRNAs carry the amino acids and base-pair with the messenger RNA (mRNA) so that the amino acids can be joined into proteins.

tRNA's structure contains an amino acid attachment site and a template-recognition site. The template-recognition site is called an anticodon and contains a sequence of three bases that are complementary to the codon on the mRNA. tRNA is transcribed from DNA in the nucleus and travels to the cytoplasm. Each tRNA can be used repeatedly.

There are 61 codons that code for the 20 amino acids. However, most prokaryotic cells have only 30-40 different tRNAs and eukaryotes have about 50 different tRNAs. This is possible because the third nucleotide of the codon, also called the wobble base, allows wobble pairing of the anticodon to the codon, so a single tRNA can read more than one codon.
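
Wobble pairing can be made concrete with a small lookup. The sketch below uses the classic wobble rules for the 5' base of the anticodon (C reads G; A reads U; U reads A or G; G reads C or U; inosine, I, reads U, C, or A). The function name is made up, and modified bases other than inosine are ignored, so this is an illustration and not a complete model of decoding.

```python
# Strict Watson-Crick pairs, used for codon positions 1 and 2.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Wobble rules: 5' base of the anticodon -> allowed 3' (third) bases of the codon.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}

def codons_read_by(anticodon):
    """Codons (5'->3') that an anticodon (written 5'->3') can pair with.

    Pairing is antiparallel: the anticodon's 3' base pairs with the codon's
    first base, and the anticodon's 5' base is the wobble position.
    """
    wobble_base, middle, three_prime = anticodon
    prefix = WATSON_CRICK[three_prime] + WATSON_CRICK[middle]
    return sorted(prefix + third for third in WOBBLE[wobble_base])

print(codons_read_by("GAA"))  # phenylalanine anticodon
print(codons_read_by("IGC"))  # an inosine-containing alanine anticodon
```

Because one anticodon can read two or three codons, fewer than 61 tRNAs are enough to decode all sense codons.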

An example of the crystal structure of yeast phenylalanine tRNA.

Role in Protein Synthesis


In protein synthesis, a tRNA molecule takes a specific activated amino acid to the site of synthesis. The amino acid is esterified to the 3'- or 2'-hydroxyl group of the terminal adenylate of the tRNA. This joining of tRNA and amino acid forms an aminoacyl-tRNA and is catalyzed by a specific enzyme called an aminoacyl-tRNA synthetase (aaRS). There are 20 aminoacyl-tRNA synthetases, one for each amino acid. Each aaRS recognizes the tRNAs that correspond to its amino acid. The esterification reaction, also called charging of the tRNA, is powered by ATP.

The process of protein synthesis starts when a charged tRNA (a tRNA with an amino acid attached), mRNA, and the small and large ribosomal subunits come together and form the initiation complex, which contains a peptidyl binding site (P site) and an aminoacyl binding site (A site). The first tRNA, known as the initiator tRNA, binds to the mRNA start codon, AUG; thus, the first amino acid in the chain is methionine. To add another amino acid to the polypeptide chain, a second charged tRNA must enter the vacant A site and pair its anticodon with the next mRNA codon. The P site and A site are in close proximity, which allows a peptide bond to form between the carboxyl group of the amino acid in the P site and the amino group of the amino acid on the tRNA in the A site. The reaction is catalyzed by peptidyl transferase. The complex then moves along the mRNA in a process called translocation, which displaces the now-empty tRNA from the P site. The tRNA in the A site moves into the P site so that another charged tRNA can enter the A site. This process continues until a stop codon is reached and the polypeptide chain is released from the ribosome.

1. amino acid + ATP --> aminoacyl-AMP + PPi
2. aminoacyl-AMP + tRNA --> aminoacyl-tRNA + AMP


tRNA Structure


2. Secondary Structure


The secondary structure resembles a cloverleaf because of its four base-paired stems, also called arms. The cloverleaf contains three non-base-paired loops: the D loop, the anticodon loop, and the TΨC loop. The fourth stem, the acceptor stem, is a duplex formed between the 5' segment and the 3' segment of the molecule; the terminal CCA at the 3' end is not base-paired.

The acceptor stem, which does not end in a loop, is the site where the enzyme aminoacyl-tRNA synthetase attaches an amino acid. It is located opposite the anticodon arm, which reads the mRNA.

There are different types of loops. The D arm ends in the D loop, and the anticodon arm ends in the anticodon loop. The figure shows the hydrogen bonds within the stem-loop structure; these hydrogen bonds stabilize the structure.

3. Tertiary Structure


The tertiary structure can be described as a compact, three-dimensional L shape. The structure is held together and stabilized by base pairing and base stacking, including base pairs between nucleotides in the D loop and the TΨC loop. At one end of the L is the three-base sequence called the anticodon.



The anticodon region of a transfer RNA is a sequence of three bases that are complementary to a codon in the messenger RNA. During translation, the pairing between the anticodon and the mRNA codon positions the tRNA on the ribosome. The amino acid is attached at the 3' end of the tRNA and will be joined to the growing chain by a peptide bond. In prokaryotic cells, about 35 tRNAs with different anticodons are present. In eukaryotic cells, about 50 are present. A tRNA with the anticodon CCC is complementary to the codon GGG. The anticodon AAA is complementary to the codon UUU. Because each type of tRNA has a different anticodon, the anticodon allows each tRNA to recognize its own codons.

tRNA Aminoacylation


Aminoacyl-tRNA is an amino acid ester of tRNA, also called a charged tRNA. Forming a peptide bond between free amino acids is thermodynamically unfavorable, so each amino acid is first activated by attachment to a tRNA. In aminoacylation, an amino acid is esterified to the 3' end of a tRNA carrying the corresponding anticodon. These pairings of amino acids and tRNAs define the genetic code. Aminoacyl-tRNA synthetases (AARSs) catalyze the aminoacylation of tRNAs. This process plays an important role in transferring genetic information from the nucleotide sequence of a gene to the amino acid sequence of a protein. When errors occur, aminoacyl-tRNA synthetases use editing mechanisms to correct them: they hydrolyze incorrectly aminoacylated tRNAs, which prevents the wrong amino acid from being incorporated.

Aminoacyl tRNA synthetases (AARSs)


AARSs are enzymes that catalyze the esterification of a specific amino acid to a tRNA to form an aminoacyl-tRNA. AARSs play a major role in translation during protein synthesis. Recent research has shown that AARSs also have functions outside translation (ex-translational functions).

Role of AARSs in translation


The accuracy of protein translation depends on how exactly each AARS recognizes both the amino acid to be activated and the cognate tRNA molecules. This is a crucial step in the fidelity of translation. All AARSs carry out the same two-step reaction:

Step 1: The AARS binds ATP and the amino acid and forms an aminoacyl-adenylate intermediate, in which the 5'-phosphate of AMP (the α-phosphate of ATP) is covalently linked to the carboxyl group of the amino acid.[4][5] The energy from ATP hydrolysis thus activates the amino acid, and the resulting aminoacyl-AMP stores that energy.[6]

Step 2: The amino acid is transferred to the appropriate tRNA and becomes covalently bound to either the 2'-OH or the 3'-OH of the terminal adenosine at the 3' end of the tRNA. The energy stored in aminoacyl-AMP drives this transfer, which forms the aminoacyl-tRNA.

Role of AARSs in ex-translation


Modified versions of AARSs and natural fragments take part in ex-translational functions, as recent studies have confirmed. The interplay of AARSs appears to be at the center of homeostatic mechanisms that control angiogenesis, inflammation, metabolism, and tumorigenesis.[7] Recent experiments found an ex-translational function in the interplay between natural extracellular fragments of human TrpRS and TyrRS in angiogenesis. TyrRS was found to have a nuclear localization signal that is controlled by its cognate tRNA (tRNA-Tyr): a decrease in the level of tRNA-Tyr increases nuclear import of TyrRS, which affects many gene regulatory mechanisms. In contrast, an increase in the level of tRNA-Tyr decreases nuclear import of TyrRS. Thus, the subcellular distribution of TyrRS is directly controlled by the demands of protein synthesis, and this control is an example of a homeostatic mechanism that balances a translational function with an ex-translational one.[8]

However, recent research also showed that the ex-translational function of AARSs is itself regulated, in contrast to the discussion above. This regulation is considered an auto-balancing process in which a natural fragment controls the activity of its own parent protein. Thus, further research is needed to confirm the specific role of AARSs in ex-translational functions.

Binding to Ribosome

tRNA's function is to bring amino acids to the ribosome during translation.

tRNA binds at the A, P and E sites of the ribosome. The A site binds the aminoacyl-tRNA specified by the codon positioned at that site; this codon determines the next amino acid to be added to the peptide chain. The A site accepts an aminoacyl-tRNA only when the P site is occupied. The P site holds the peptidyl-tRNA, the tRNA that carries the growing chain of amino acids. Lastly, the E site holds the empty tRNA before it leaves the ribosome.

Three dimensional image of a tRNA.



PURPLE: Acceptor stem

RED: D arm

BLUE: Anticodon arm

BLACK: Anticodon

GREEN: T arm

Diseases Caused by Mitochondrial tRNA Gene Mutation


Mitochondria are organelles in the cell, and the mitochondrial genome encodes 22 tRNAs. Mutations in these tRNA genes can cause serious diseases. Seven kinds of disease are associated with mitochondrial tRNA gene mutations:

Basal ganglia calcification, cerebellar atrophy, increased lactate; a CT image of a person diagnosed with MELAS

1- np5601 G->A and np3243 A->G gene mutations are related to MELAS (mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes). Most patients develop the disease before the age of 40, with epilepsy and lactic acidosis. Some die between the ages of 20 and 30.

2- np8363 G->A, np8356 T->C, and np8344 A->G gene mutations are related to MERRF (myoclonic epilepsy with ragged red fibers). MERRF affects the central nervous system, causing epilepsy, dementia, and epicophosis.

Example of "ragged red fibers" in MELAS syndrome.

3- np4274 T->C gene mutation is related to LIMM (lethal infantile mitochondrial myopathy). Most patients are newborns with neurological defects and lactic acidosis, and they die within one month.

4- np1644 G->T gene mutation is related to subacute necrotizing encephalomyelopathy (SNE). This disease shows familial autosomal recessive inheritance and affects newborns.

5- np606 A->G gene mutation is related to rhabdomyolysis. Toxins released by muscle cells are the main cause of rhabdomyolysis.

6- np4500 G->A gene mutation is related to splenic lymphoma. Splenic lymphoma is a malignant tumor of the spleen. It usually results from the spread of advanced-stage lymphoma.

7- np4336 A->G, np15927, and np15928 gene mutations are related to Parkinson's disease and Alzheimer's disease. Parkinson's disease is a degenerative disorder of the central nervous system.


1. Inheritance of Mitochondrial Disease:

2. Diseases of Human Mitochondria tRNA:






Ribosomal RNA, also known as rRNA, is a major component of the ribosome. rRNA catalyzes the formation of polypeptides, provides a mechanism for decoding mRNA into amino acids, and interacts with tRNA during translation. rRNA was once thought to be mainly a structural component of ribosomes, but it is now known to be the catalytic element in protein synthesis. It is the most abundant type of RNA in the cell (about 80%).

The Large 50S Subunit

The ribosome is composed of a large and a small subunit, each containing rRNA. The prokaryotic ribosome is 70 svedbergs (70S) in size. A svedberg is a unit of the sedimentation coefficient, a measure of how fast a molecule sediments when centrifuged. The 70S ribosome contains a large 50S subunit, which includes the 23S and 5S rRNAs, and a small 30S subunit, which contains the 16S rRNA. The 23S, 16S and 5S rRNAs are essential for protein synthesis and for the structure and function of the ribosome. These RNAs are formed by cleavage of a primary 30S transcript, followed by further processing and folding of the molecules to form internal base-paired structures. Experiments using chemical probing methods have provided a detailed model of the secondary structure of the 16S rRNA. The secondary structure was also deduced by analyzing and comparing sequences. The 16S rRNA folds together with ribosomal proteins to form the 30S subunit. Conformational changes in the 16S rRNA play a crucial role in the assembly of the ribosome. The 5S rRNA is an important part of the large subunit of the ribosomes of most organisms. rRNA is the most abundant of the three major types of RNA, making up about 80% of the RNA in E. coli, for example, followed by tRNA (15%) and mRNA (5%). In E. coli, the 23S rRNA has a mass of 1.2 x 10^3 kd and is about 3700 nucleotides long.

With the help of X-ray crystallography, scientists have been able to reveal the detailed features of these structures.

The polymerase chain reaction (PCR) has been of great importance in the amplification of rRNA genes. PCR is used to amplify rRNA genes in many organisms; however, amplification of rRNA genes by traditional PCR methods has been found not to work in extremely thermophilic organisms.

The ribosome contains three tRNA binding sites, formed largely by rRNA: an A site, a P site, and an E site. At the A site, the ribosome binds an aminoacyl-tRNA, a tRNA bound to an amino acid. The growing peptide chain, carried by the peptidyl-tRNA in the P site, is transferred to this amino acid. After the transfer, the empty tRNA moves from the P site to the E site, from which it is ejected. The mRNA then shifts 3 bases (1 codon) so that the next aminoacyl-tRNA can bind at the A site.

In prokaryotes, rRNAs are formed by cleavage and other modifications of nascent RNA chains. That is, the precursors of transfer and ribosomal RNA are cleaved and chemically modified after transcription (DNA --> RNA).

Base Pairing


rRNA plays a major part in monitoring base pairing between the codon and the anticodon. "Adenine 1493, one of three universally conserved bases in 16S rRNA, forms hydrogen bonds with the bases in both the codon and the anticodon only if the codon and anticodon are correctly paired."



M. Ogle and V. Ramakrishnan. Annu. Rev. Biochem. 74 (2005):129-177.

RNA interference in cultured cells.

Small RNA is a class of RNA that includes small interfering RNA (siRNA), microRNA (miRNA), and piwi-interacting RNA (piRNA). These small RNAs play important roles in biological and disease processes.

siRNA & micro RNA


Small interfering RNA (siRNA) is a class of RNA molecules around 20-25 nucleotides in length. siRNAs act mostly in the RNA interference (RNAi) pathway, where they interfere with the expression of a specific gene.

siRNA is a type of double-stranded RNA that was found to direct cleavage of target mRNA at specific sites. siRNAs were then designed to silence target transcripts after transfection into mammalian cells. This allowed the development of RNAi-based applications such as a new class of therapeutics.

MicroRNA (miRNA) is a class of RNA molecules found in eukaryotic cells. miRNAs are generally 20-25 nucleotides in length and are involved in translational repression and gene silencing. They are similar to siRNAs and were found to negatively regulate the expression of target transcripts.

The stem-loop of a pre-microRNA.

These two types of RNA were established as guides that direct the silencing of target transcripts. This raised the question of how these small RNAs are produced. It was found that immunoprecipitates from Drosophila S2 cells processed dsRNA (double-stranded RNA) into siRNA in vitro. miRNA was found to be derived from a conserved stem-loop precursor, which suggested that a dicing step could be required for miRNA biogenesis. The stem-loop forms part of a miRNA precursor several hundred nucleotides long, which is then processed into miRNA. This precursor was found in Drosophila pupae.

In analyses of small RNA pathways in Drosophila, isolated dicer-1 and dicer-2 mutants showed that Dicer-1 and Dicer-2 are responsible for the biogenesis of miRNA and siRNA, respectively. Dicer-1 processes pre-miRNA independently of ATP, while Dicer-2 processes dsRNA in an ATP-dependent manner. In mammalian cells, however, a single Dicer generates both miRNA and siRNA.

siRNAs effect silencing by programming RNAi effectors (such as RISC) to target mRNA. RISC is a magnesium-dependent endoribonuclease that is guided by miRNA or siRNA to cleave target mRNA.

The effector mechanism of miRNA is controversial. The disparity arises because miRNA-induced RISC activity lacks a well-defined biochemical readout comparable to the clear one for siRNA.

Making RISC


RISC: the effector complex for small RNAs

It is known that small RNAs aid in the regulation of gene expression. However, small RNAs cannot function individually to catalyze reactions. Instead, they come together and form RNA-induced silencing complexes (RISCs) in order to help with silencing genes and locating RISC targets. In this sense, the assembly of RISC is crucial for the small RNAs to do their job. [9]

Argonaute: the core component of RISC

The Argonaute (Ago) family of proteins is a main component of RISC that is essential to RISC’s function of target recognition and silencing. The Ago family can be divided into the Ago subfamily and Piwi subfamily. These Ago proteins, each with their own characteristics, are in charge of the functions of the small RNAs that they are paired with. SiRNAs and micro RNAs bind to Ago proteins while piRNAs bind to Piwi proteins. In mammals, the four proteins from the Ago subfamily (AGO1, AGO2, AGO3, AGO4) hinder translation in their target mRNAs, with AGO2 having the unique ability within its subfamily to induce RNA interference. In flies, AGO2 also triggers RNA interference in siRNA while AGO1 focuses on miRNA. What is different in flies compared to the case with mammals is that both AGO1 and AGO2 in flies can target cleavage and cause RNA interference. [9]

Two steps in RISC assembly: RISC loading and unwinding

There are two steps involved in RISC assembly. The first step is called RISC-loading, and this is when small RNA duplexes are incorporated into Ago proteins. Prior to this step, the double-stranded siRNAs and miRNAs are converted by RNase III enzymes (Drosha and Dicer) into small RNA duplexes: siRNA duplexes and miRNA-miRNA* duplexes. In the second step, the double-stranded small RNA duplexes are separated into two strands inside the Ago protein. Of the two strands, the strand with a less stable 5’ end is kept, serving as the ‘guide strand’. The other strand, called the ‘passenger strand’, is thrown out in order to produce a functional RISC. This strand selection in which one strand is preferred over the other is referred to as the ‘asymmetric rule’. [9]
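
The asymmetric rule can be sketched as a comparison of the two 5' ends of the duplex. In the Python below, end stability is approximated by counting hydrogen bonds (three for a G-C pair, two for an A-U pair) over the first four base pairs of each strand. That scoring, the window of four, the function names, and the example sequence are all assumptions made for illustration; the duplex is also assumed to be blunt and perfectly paired, so each strand's 5' end pairs with the other strand's 3' end.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Partner strand (5'->3') of a perfectly paired, blunt duplex."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def end_stability(strand, window=4):
    """Crude stability score for the 5' end: hydrogen bonds in the first pairs."""
    return sum(3 if base in "GC" else 2 for base in strand[:window])

def select_guide(strand_a, strand_b, window=4):
    """Return the strand with the less stable 5' end, or None if they tie."""
    score_a = end_stability(strand_a, window)
    score_b = end_stability(strand_b, window)
    if score_a == score_b:
        return None
    return strand_a if score_a < score_b else strand_b

strand_a = "UUAUCGAGCGGC"                 # hypothetical strand with an A/U-rich 5' end
strand_b = reverse_complement(strand_a)   # its partner has a G/C-rich 5' end
print(select_guide(strand_a, strand_b))
```

Here the A/U-rich 5' end scores lower, so that strand is kept as the guide and its partner would be the passenger strand.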

Genome Encoded Small RNA


The Human Genome Project observed a relatively small number of protein-coding genes relative to genome size. It is believed that only five percent of the genome encodes proteins. miRNA, siRNA, and piRNA are part of the noncoding genome.

miRNAs are believed to exist in hundreds of species and are identified through forward genetics by isolation of miRNA mutants, bioinformatic predictions based on the stem-loop, and direct cloning of small RNAs. How pre-miRNA is converted is not fully clear, and studies indicate that pri-miRNA and pre-miRNA are processed separately, in the nucleus and in the cytoplasm respectively. dsRNA is a feature of pri-miRNA and aids in its processing into pre-miRNA. Dicer and Drosha are among the factors required for small RNA maturation. They are believed to function with dsRNA-binding proteins, which aid in miRNA production.

Endo-siRNAs play important roles in regulating genome functions in diverse species. They direct cleavage of target mRNA, and RNA-dependent RNA polymerases use the cleaved mRNA as a template to prime the synthesis of secondary siRNAs. These are then loaded onto non-slicing Argonaute proteins (Agos) to contribute to target silencing. This corresponds to the spreading of RNAi along the mRNA and has been linked to silencing in worms.

piRNAs are small RNAs that also take part in interference but act mainly on repetitive sequences. In mammals, piRNAs map uniquely in the genome and cluster in a small number of regions of around 10 to 83 kb. Findings on the amplification of piRNA led to a ping-pong model, in which piRNA production alternates between Ago3 and Aubergine to create new piRNAs through each successive round. Different Piwi proteins carry out piRNA functions both cooperatively and independently of one another. piRNAs play an important role in germline development and the maintenance of genomic integrity. They are also involved in silencing, although the mechanism is still unknown; studies suggest that they regulate DNA methylation.



  1. Churchill, Das, Tyler The histone shuffle: histone chaperones in an energetic dance
  4. Desogus, Gianluigi; Flavia Todone; Peter Brick; and Silvia Onesti. "Active Site of Lysyl-tRNA Synthetase: Structural Studies of the Adenylation Reaction." Biochemistry, 2000 vol 39, 8418-8425.
  5. Klug, William, and Michael Cummings. Concepts of Genetics. 5th Edition. Upper Saddle River, NJ: Prentice Hall, 1997.
  6. Hartwell, Leland; Leroy Hood; Michael Goldberg; Ann Reynolds; Lee Silver; and Ruth Veres. Genetics: From Genes to Genomes. Boston: McGraw-Hill, 1999.
  7. Guo, Min, and Paul Schimmel. "Homeostatic mechanisms by alternative forms of tRNA synthetases." Trends in Biochemical Sciences. Web. 7 Dec. 2012.
  8. G. Fu et al. tRNA-controlled nuclear import of a human tRNA synthetase J. Biol. Chem., 287 (2012), pp. 9330–9334
  9. a b c Kawamata, Tomoko and Tomari, Yukihide. "Making RISC", '[Trends in Biochemical Sciences]', July 2010: 368-375. Retrieved on 21 November 2012.
  10. Liu, Qinghua; Paroo, Zain; Biochemical Principles of Small RNA Pathways Annu. Rev. Biochem. 79 (2010): 295-319.

Image from Wikipedia Commons



MicroRNAs (miRNAs) are short, single-stranded RNAs that are about 21 nucleotides in length. Their function is to regulate gene expression. Like other types of RNA, miRNAs are transcribed from DNA; however, they are not translated into protein. miRNAs are non-coding RNAs that bind to complementary mRNAs and inhibit their translation. miRNAs and siRNAs both function to interfere with gene expression. However, miRNAs are single-stranded, whereas siRNAs are double-stranded.

miRNAs have been found to play a crucial role in regulating the DNA damage response. Scientists believe that the transmission of genetic information in eukaryotic cells requires accuracy in DNA replication and chromosome segregation, as well as the ability to sense and repair spontaneous and induced DNA damage. To maintain genomic integrity, cells mount a DNA damage response, a complex network of signaling pathways. This network is composed of coordinated sensors, transducers, and effectors that act in cell cycle arrest, apoptosis, and DNA repair.[1]

miRNAs have recently been linked to various diseases. Recent research has shown a connection between the dysregulation of miRNAs and certain diseases, which calls for further research into the robust regulation of miRNA activity.[2]

miRNAs were once considered to be very stable molecules, because miRNA expression is known to be strictly controlled by mechanisms acting at the level of transcription and of the processing of miRNA precursors. Recently, however, scientists have identified another mechanism that is important for miRNA homeostasis: the active degradation of mature miRNAs. Degradation of miRNA contributes to dynamic miRNA expression patterns. Research has shown that miRNA degradation can affect specific sets of miRNAs, although how this specificity comes about remains unknown.[3]

Formation & Function


The main function of miRNAs is to regulate the translation of mRNA. In the nucleus, miRNAs are first transcribed as primary miRNAs (pri-miRNAs) with a cap and a poly-A tail. The pri-miRNAs are then processed into precursor miRNAs (pre-miRNAs) by an enzyme called Drosha. A pre-miRNA is a stem-loop structure about 70 nucleotides long. The pre-miRNAs are then exported into the cytoplasm and cut into mature miRNAs by an enzyme called Dicer. The mature miRNAs are incorporated into the RNA-induced silencing complex (RISC) and activate it. The activated RISC allows the miRNA to bind the target mRNA and silence gene expression. In animal cells, miRNAs more commonly base-pair imperfectly with the mRNA and inhibit protein translation. The binding of a miRNA to a complementary mRNA can lead to degradation of the mRNA and therefore end its translation, or the miRNA can inhibit recognition of the 5' cap and prevent translation. In plant cells, miRNAs are more likely to pair perfectly with the target mRNA and promote its cleavage. MicroRNAs are formed from hairpins in long single-stranded RNAs that fold back on themselves. The double-stranded hairpins are cut by Dicer, which results in a single-stranded microRNA. The microRNA then forms a microRNA-protein complex that can degrade a target mRNA or block its translation. In a few instances, miRNAs have shown signs of promoting translation, especially under starvation conditions. The reasons for such activity are not known.

Canonical miRNA Function


Recent studies have demonstrated that miRNAs play a critical role in interacting with the canonical DNA damage response. The DNA damage response is an active system that includes the initiation of transcriptional programs, enhancement of DNA repair, and apoptosis if damage is severe. DNA double-strand breaks are repaired by the homologous recombination and non-homologous end joining pathways. Other forms of DNA damage are repaired by base excision repair (BER), nucleotide excision repair (NER), and DNA mismatch repair (MMR).

miRNAs play a significant part in gene regulation and other cellular functions. Many important genes in the DNA damage response are regulated by their corresponding miRNAs; that is, miRNAs modulate the DNA damage response by way of their target genes. During DNA repair, chromatin remodeling takes place to give DNA repair proteins access to damaged DNA. DNA damage response genes such as ATM, H2AX, and RAD52 are miRNA targets and are thus subject to inhibition by miRNAs. Higher expression of certain miRNAs, such as miR-421, has been shown to reduce ATM expression and to downregulate H2AX in particular cellular situations.

DNA-damaging agents used in various treatments have been shown to induce miRNAs. DNA damage correlates with the activation of miRNAs, which underscores that miRNAs regulate the DNA damage response according to the magnitude of the damage.

Noncanonical miRNA Function


Recent studies have shown that a miRNA can control the biogenesis of another miRNA by directly targeting its primary transcript in the nucleus. One example is miR-709, which negatively regulates miR-15a/16-1. miR-709 is found in the mouse nucleus, where it binds specifically to a 19-nucleotide recognition element on the primary transcript of miR-15a/16-1. This blocks the processing of the miR-15a/16-1 primary transcript into the precursor. Maturation is therefore regulated at a post-transcriptional level: after the primary transcript is made but before the precursor is formed. Because miR-15a/16-1 in turn regulates apoptosis, miR-709 indirectly regulates apoptosis as well. This demonstrates that a miRNA can affect expression within a cell by regulating the biogenesis of other miRNAs.

miRNAs can also regulate long ncRNAs. Long ncRNAs are non-protein-coding transcripts that are generally longer than 200 nucleotides. The first experimental evidence that long ncRNAs are functional miRNA targets was reported by Hansen. In that experiment, the antisense transcript of cerebellar degeneration-related protein 1 (CDR1), which is a circular ncRNA, was shown to be nearly perfectly complementary to miR-671, which is found in the nucleus. miR-671 directs the cleavage of the CDR1 antisense transcript in an AGO2-dependent manner. Negative regulation of the circular antisense ncRNA also decreases the CDR1 sense transcript. The study shows that antisense RNA can be destabilized by a miRNA through AGO2-mediated cleavage, and that the sense mRNA can be stabilized by the circular noncoding antisense RNA.

miRISCs and Its Components


miRNAs combine with Argonaute (AGO) proteins and GW182 proteins to form miRNA-induced silencing complexes, or miRISCs. AGO attaches to the N-terminus of the GW182 protein, while the miRNA binds to AGO. The GW182 protein seems to be the more important of the two, as it contains the main silencing region. This was discovered when miRNA-induced repression remained effective even after the knockdown of AGO in Drosophila cells.

miRNA Inhibition of Translation


miRNAs have several ways of inhibiting translation. Suppression can occur both before and after translation initiation, although the former seems to be preferred.

Before Initiation

  1. miRISC can suppress translation by interfering with recognition of the 5' cap structure by eIF4F. The miRNA prevents the cap from being read, so initiation never starts. By contrast, mRNAs that can be translated without the cap recognition step are not suppressed by the miRNA.
  2. miRNAs can also prevent the formation of a functional ribosome. On a normal mRNA, the 40S and 60S ribosomal subunits come together to form the 80S complex, which translates the mRNA. miRNAs inhibit the 60S subunit from joining the 40S subunit, so translation is never able to start.

Post Initiation

  1. The miRNA blocks elongation of the polypeptide being translated.
  2. The ribosome is forced to drop off the mRNA. The 40S and 60S ribosomal subunits separate before translation is complete.
  3. The miRNA induces proteolysis of the newly translated polypeptide chain. The chain is broken up by enzymes.

The mechanisms behind these three post-initiation modes of inhibition are not well understood.

miRNA stability


In contrast to the earlier view that miRNAs are highly stable, recent research has shown that individual miRNAs in certain environments are subject to accelerated decay, which alters miRNA levels and therefore miRNA activity.[4]

During miRNA biogenesis, miRNAs are transcribed by RNA polymerase II as primary transcripts (pri-miRNAs) and then matured in a multi-step process to produce the mature, functional miRNA. In one case, the pri-miRNAs are capped and polyadenylated and are quite long (several kilobases). Pri-miRNAs possess hairpin structures that include the mature miRNA sequence in their stem. In another case, the precursor miRNAs (pre-miRNAs) reside in introns of mRNAs or other non-coding RNAs. In either case, the nuclear RNase III enzyme Drosha, in a complex with its co-factor DiGeorge syndrome critical region 8 homolog (DGCR8), cleaves near the base of the stem and releases a pre-miRNA of about 70 nucleotides.[5]

Deadenylation and Decay


In deadenylation, the miRNA is bound to AGO and GW182, and GW182 in turn binds poly(A)-binding protein (PABP), forming a slightly different miRISC. This miRISC promotes removal of the poly(A) tail from the mRNA, which is followed by removal of the 5' cap and decay of the mRNA. Deadenylation is effective because it rids the cell of excess mRNA, eliminating the chance of accidental translation. The decayed fragments are collected by the P bodies, and reused by the miRNA.

The degradation of miRNAs is carried out by several miRNA-degrading enzymes. Many such enzymes have been identified, including both 3'-to-5' and 5'-to-3' exoribonucleases. Recent research has shown that particular RNases take part in the turnover of different sets of miRNAs in different organisms. However, the substrate specificity and phylogenetic conservation of individual miRNA turnover enzymes still need to be investigated.[6]

microRNA-206 and Synapse Repair


In a mouse model of ALS, production of microRNA-206 is induced when the mice develop the disease, and deficiency of microRNA-206 accelerates the progression of the disease. MicroRNA-206 is required for regeneration of damaged neuromuscular synapses (the connections that pass signals between nerve and muscle cells). When a synapse is damaged, microRNA-206 turns on repair. Without microRNA-206, synapses cannot be repaired; however, some synapses can grow back. MicroRNA-206 acts through the histone deacetylase and fibroblast growth factor (FGF) signaling pathways. Growth factors are specific signals from other cells that tell a cell to grow. MicroRNA-206 blocks translation; it then activates histone deacetylation, which condenses chromatin and therefore blocks transcription. In this way microRNA-206 slows the progression of ALS by repairing neuromuscular synapses.

MicroRNA genes are found in intergenic regions, where they have their own promoters and regulatory units. Approximately forty percent of miRNA genes lie in the introns of protein-coding or non-protein-coding genes, or even in exons. These miRNAs are usually found in an orientation that lets them be regulated together with their host gene. Between forty-two and fifty percent of other miRNA genes share a common promoter and originate from polycistronic units. A polycistronic unit contains 3-6 discrete loops from which mature miRNAs are processed, but the miRNAs of such a family are not necessarily homologous in structure and function. The promoters share some motifs with the promoters of other genes transcribed by RNA polymerase II, such as protein-coding genes. The DNA template is also not the final word on mature miRNA production, because about five percent of human miRNAs show RNA editing: the site-specific modification of RNA sequences to yield products different from those encoded by their DNA. Editing increases the diversity and scope of miRNA action beyond what is implied by the genome alone.

miRNAs and Disease




Recent studies have shown that miRNAs are involved in causing diseases. In the case of cancer, researchers found that miRNAs can inhibit the E2F1 protein, which regulates cell proliferation. The miRNAs bind to the mRNA before it can be translated into the E2F1 protein. One microRNA, miR-21, was labeled the first oncomir. It is known to aid tumor growth and metastasis by targeting naturally occurring tumor suppressors in the human body. Tropomyosin 1 (TPM1) is a direct target of miR-21, along with programmed cell death 4 (PDCD4) and maspin, all of which are inversely correlated with the expression of miR-21 in the presence of tumors. This shows that miR-21 can target multiple genes and inhibit multiple pathways at the same time.

Kidney Fibrosis


Renal fibrosis is the excessive accumulation of fibrous (connective) tissue, occurring as a reparative process after scarring or trauma to the kidney. This type of nephropathy directly promotes renal dysfunction, which ultimately leads to kidney failure and death. Studies have shown that one microRNA, miR-21, is significantly elevated in expression during the progression of kidney scarring. Experiments were conducted in mice to validate this miRNA and its effects.

Mice lacking miR-21 showed no overt abnormalities and no obvious suppression or prevention of tumor growth; however, these mice developed far less interstitial scar tissue in response to kidney injury. Analysis detected groups of genes, and their associated metabolic pathways, that are inhibited by miR-21. One of these is the lipid metabolism pathway involving peroxisome proliferator-activated receptor-α (Pparα); synthesized anti-miR-21 oligonucleotides were used to inhibit miR-21. Pparα is found to ease the effects of ureteral-obstruction-induced kidney fibrosis. miR-21 also regulates the redox metabolic pathway through a protein called Mpv17l, an inhibitor of reactive oxygen species generation. Repression of Mpv17l in cells enhances kidney damage by increasing the production of oxygen radicals.

These studies demonstrate that miR-21 has a broad spectrum of influences on the microscopic scale and can be a suitable target for antifibrotic and cancer therapies.

Heart Disease


Other studies have addressed the role of miRNAs in the heart by inhibiting miRNA maturation in the murine heart, which showed that miRNAs play an essential role during its development. The expression levels of specific miRNAs change in diseased human hearts, pointing to their involvement in cardiomyopathies. Studies in animal models, mostly mice, have identified several specific miRNAs that act during heart development and under pathological conditions. These miRNAs regulate key factors that are important for cardiogenesis, the hypertrophic growth response, and cardiac conductance.



Fabian, Marc R., Nahum Sonenberg, and Witold Filipowicz. "Regulation of mRNA Translation and Stability by MicroRNAs." Annual Review of Biochemistry (2010): 351-79.

Neil A. Campbell, Jane B. Reece. "Biology, 8th edition."

  1. Wan, Guohui, Rohit Mathur, Xiaoxiao Hu, Xinna Zhang, and Xiongbin Lu. "miRNA response to DNA damage." Trends in Biochemical Sciences. Web. 7 Dec. 2012.
  2. Chang, T.C. and Mendell, J.T. (2007) microRNAs in vertebrate physiology and human disease. Annu. Rev. Genomics Hum. Genet. 8, 215-239
  3. Großhans, Rüegger. "MicroRNA turnover: when, how, and why." Trends Biochem Sci. 2012. PubMed, National Center for Biotechnology Information. Web. 6 Dec. 2012.
  4. Großhans, Rüegger. "MicroRNA turnover: when, how, and why." Trends Biochem Sci. 2012. PubMed, National Center for Biotechnology Information. Web. 6 Dec. 2012.
  5. Krol, J. et al. (2010) The widespread regulation of microRNA biogenesis, function and decay. Nat. Rev. Genet. 11, 597-610
  6. Großhans, Rüegger. "MicroRNA turnover: when, how, and why." Trends Biochem Sci. 2012. PubMed, National Center for Biotechnology Information. Web. 6 Dec. 2012.

1. Small nuclear RNA (snRNA)


Small nuclear RNAs (snRNAs) are small RNA molecules found in the nucleus of eukaryotic cells. They are usually 300 nucleotides or shorter, and they are not the only RNA the nucleus contains. The function of snRNA was discovered a few years before ribozymes were. snRNAs are transcribed by RNA polymerase II or RNA polymerase III. They are important because they take part in pre-mRNA splicing and processing, which is the removal of introns from hnRNA, and in the maintenance of telomeres, the ends of chromosomes. Five snRNAs make up the spliceosome, which removes introns from nuclear pre-mRNA in eukaryotes. The spliceosome interacts with the ends of an RNA intron, cuts at specific points to release the intron, and then immediately joins the two exons that were adjacent to it. The snRNAs are also responsible for aligning splice sites and mediating catalysis. Thomas Cech and Sidney Altman discovered that RNA molecules can serve as catalysts, which changed views of molecular evolution. snRNAs are always found with specific proteins, forming complexes called small nuclear ribonucleoproteins (snRNPs), or snurps. Their secondary structures are highly conserved in organisms ranging from yeast to human beings. A large group of snRNAs is known as small nucleolar RNAs (snoRNAs). snoRNAs are responsible for cleaving long eukaryotic pre-rRNA. They are important in RNA biogenesis and guide chemical modifications of ribosomal RNA (rRNA) and other RNAs (tRNA and snRNA). Many snoRNAs are produced from processed introns; the host gene for the snoRNA often encodes a ribosomal protein or translation factor. [43] [44]

2. Small Nucleolar RNA molecule (snoRNA)


snoRNAs, or small nucleolar RNAs, modify ribosomal RNAs (rRNAs) by mediating the cleavage of long pre-rRNA strands into their functional subunits (the 18S, 5.8S and 28S molecules). snoRNAs can also add the finishing modifications to the rRNA subunits. [45]

3. Micro RNA (miRNA)


Micro RNA (miRNA) is a gene-regulatory small RNA that is typically 21-23 nucleotides long. It is similar to small interfering RNA (siRNA) in that both bind to complementary mRNA molecules and inhibit their translation. However, unlike siRNA, which is a double-stranded RNA, miRNA is a single-stranded RNA and is only partially complementary to its target mRNA molecules. This class of RNA is non-coding.
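The difference between perfect and partial complementarity can be made concrete with a small sketch. The function name and the 8-nucleotide sequences below are hypothetical, chosen only for illustration, and the sketch counts only Watson-Crick pairs (it ignores G-U wobble pairs, which real RNA duplexes can contain).

```python
# Watson-Crick partners in RNA (G-U wobble pairs are deliberately ignored here).
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def paired_fraction(mirna_5_to_3: str, target_5_to_3: str) -> float:
    """Fraction of positions that pair when the two strands align antiparallel."""
    if len(mirna_5_to_3) != len(target_5_to_3):
        raise ValueError("sequences must have equal length")
    # Antiparallel duplex: miRNA position i faces the target read from its 3' end.
    target_3_to_5 = target_5_to_3[::-1]
    matches = sum(RNA_PAIR[m] == t for m, t in zip(mirna_5_to_3, target_3_to_5))
    return matches / len(mirna_5_to_3)

mirna = "UGAGGUAG"            # hypothetical short miRNA fragment
perfect_site = "CUACCUCA"     # exact reverse complement of the fragment
partial_site = "CUACGUCA"     # same site with one mismatched base

print(paired_fraction(mirna, perfect_site))  # 1.0
print(paired_fraction(mirna, partial_site))  # 0.875
```

A score of 1.0 corresponds to the fully complementary pairing described for siRNA, while a lower score corresponds to the partial pairing described for miRNA.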

Micro RNA has a great range of functions. It is used in cellular growth, development and insulin secretions among other things.

However, too much miRNA has been implicated in diseases such as Fragile X mental retardation, as well as some forms of cancer.

4. Small interfering RNA (siRNA)


siRNAs are known by several names: small interfering RNA, short interfering RNA, and silencing RNA. They were discovered by David Baulcombe's research group in Norwich, England. They are double-stranded RNA molecules roughly 20-25 nucleotides in length, with 2-nucleotide overhangs on the 3' ends. They are largely responsible for the RNA interference (RNAi) pathway, which interferes with the expression of a gene. Other RNAi-related processes, such as antiviral mechanisms and shaping the chromatin structure of a genome, are also mediated by siRNA. The discovery that siRNA can be produced synthetically allowed RNAi to be induced in mammalian cells. This in turn opened research into drug development in the biomedical field, such as treatments for Human Immunodeficiency Virus (HIV). However, using RNAi through siRNA in living animals is difficult, because siRNA behaves differently in different types of cells and its effectiveness varies from very good to poor. It is not yet understood why the effectiveness of siRNA in living animals varies so widely. Artificial siRNA can be made with the enzyme Dicer, which cuts double-stranded RNA (dsRNA) into short fragments. By transfecting artificial siRNAs that target specific transcripts, researchers can probe gene function. Although this is a useful tool, the high cost of production makes it impractical for many laboratories and researchers. siRNAs can be produced by chemical synthesis, in vitro transcription, or RNase III/Dicer digestion of long dsRNAs, or expressed in vivo from plasmids, PCR cassettes, or viral vectors using a CMV or polymerase III transcription unit. siRNAs are used for loss-of-function studies and are highly sequence specific.

5. RNA Interference (RNAi)


RNAi was discovered by Craig Mello and Andrew Fire in the 1990s, through experiments with antisense RNA. It is the process in which double-stranded RNA triggers the degradation of homologous mRNA.


RNA interference occurs when a double strand of RNA is broken down by an enzyme called Dicer. Dicer chops the double-stranded RNA into short fragments 20-25 base pairs long. These fragments then form a complex with RISC, which uses them to find a homologous strand of RNA; RISC then catalytically cleaves that strand.
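As a rough illustration of the chopping step, the sketch below splits one strand of a long dsRNA into consecutive fixed-length pieces. It is a simplification under stated assumptions: the function name `dice` and the example sequence are hypothetical, a single fragment length is used, and the 3' overhangs of real Dicer products are not modeled.

```python
def dice(strand: str, fragment_length: int = 21) -> list[str]:
    """Split one strand of a long dsRNA into consecutive fragments."""
    if not 20 <= fragment_length <= 25:
        raise ValueError("fragment length should be within 20-25 nucleotides")
    # Step through the strand in fixed-size windows; the last piece may be shorter.
    return [strand[i:i + fragment_length]
            for i in range(0, len(strand), fragment_length)]

long_strand = "AUGC" * 16  # hypothetical 64-nucleotide strand
fragments = dice(long_strand)
print([len(f) for f in fragments])  # [21, 21, 21, 1]
```

The short leftover piece at the end shows that a strand whose length is not a multiple of the fragment length cannot be divided evenly.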



RNAi degrades mRNA in cells as a defense mechanism against viral genetic material that may have infected the cell, and it shuts down the effects of specific genes after transcription without altering the cell's DNA. It can therefore also be used as a gene silencing technique. siRNA is introduced into a cell with transfection reagents, which increase the amount of RNA and DNA that cultured cells can absorb. RNAi is used in the biomedical field to silence disease-causing genes. The siRNA can either be injected into specific cells or delivered by modified viruses that transfect the cells. RNAi has also been explored as a possible basis for birth control, by silencing a gene that encodes a protein allowing sperm to bind to the egg. RNAi is also being used to knock out genes in salamanders to discover which genes are responsible for their regenerative capabilities. The hope is to treat diseases previously thought to be incurable, such as Huntington's, Parkinson's, and Alzheimer's, by triggering the regeneration of the neurons whose death is responsible for the disorder.

6. Interference RNA (iRNA)


Interference RNA or iRNA is used for gene regulation. It is an antisense RNA (complementary to other RNA, mostly mRNA). It is important for gene regulation and it is being researched currently for collective anti-cancer properties. It has ties to siRNA as siRNA is involved in the RNA interference pathway. RNA interference (RNAi) is a phenomenon of gene silencing at the mRNA level offering a quick and easy way to determine the function of a gene both in vivo and in vitro. [46]

7. RNA is a component of telomerase


Telomerase is a ribonucleoprotein (a ribonucleic acid-protein complex). It is an enzyme that maintains the telomeres (ends) of chromosomes during DNA replication. It has been found to be useful in therapeutic, pharmaceutical, and diagnostic applications.


8. Non-coding RNA (ncRNA)


Non-coding RNA is any RNA molecule that is not translated into a protein. Non-coding RNA comes in many different forms, such as ribosomal RNA (rRNA), transfer RNA (tRNA), and small RNAs [microRNA and small interfering RNA (siRNA)]. Non-coding RNA can be small or very large. Small non-coding RNA molecules are also known as sRNA, whereas large or long non-coding RNAs are also known as lncRNA. A non-coding RNA molecule that was transcribed from DNA is often referred to as an RNA gene.

It is significant to note that there exists a growing interest in small, barely detectable non-coding RNA molecules because some of them have been found to play an important role in the regulation of gene expression. These small RNA molecules are known as RNA genes. In the early 1990s, American geneticist Victor Ambros and his colleagues first identified these molecules in the species of worm Caenorhabditis elegans. They were found to be responsible for turning off gene expression during worm development. This novel function was later discovered in other species as well. A decade later, another American geneticist Stephen R. Holbrook of Lawrence Berkeley National Laboratory in California discovered several other potential RNA genes previously undetected via a complex computer program called RNAGENiE. Currently, much research is being conducted over these tiny non-coding RNA molecules. In recent years, biotech and pharmaceutical companies have been looking into the potential of RNA genes as drug targets due to recent interest in RNA genes produced during bacterial infections and their pathogenic effects through the regulation of gene expression of host DNA.

9. Antisense RNA


Antisense RNA is an RNA strand that is complementary to a messenger RNA (mRNA) strand transcribed within the cell. The antisense RNA is a single-stranded RNA molecule. The antisense strand is brought into a cell in order to inhibit the translation of the mRNA. It does this by base pairing with the complementary mRNA strand, which prevents the mRNA from being translated.
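Because the two strands pair antiparallel, the antisense sequence is the reverse complement of the mRNA. A minimal sketch follows; the function name and the six-nucleotide example are hypothetical, and both sequences are written 5' to 3'.

```python
# Watson-Crick partners in RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def antisense(mrna_5_to_3: str) -> str:
    """Return the antisense RNA (written 5'->3') for an mRNA written 5'->3'."""
    # Reverse the mRNA, then complement each base, so the result pairs antiparallel.
    return "".join(COMPLEMENT[base] for base in reversed(mrna_5_to_3.upper()))

print(antisense("AUGGCC"))  # GGCCAU
```

Applying the function twice returns the original mRNA sequence, which is a quick way to check that the pairing rules are symmetric.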

Antisense RNA was previously thought to be useful as a therapeutic technique. However, over the past few years only one drug has been developed using antisense RNA, and antisense RNA has so far proven difficult to design effectively for disease therapy.

10. tmRNA


An RNase can remove the 3' end of an mRNA, leaving the mRNA with no stop codon for the ribosome to sense and stop translation. Once the entire strand of mRNA has been translated, the ribosome is stuck on the mRNA with a peptidyl-tRNA in its P site. tmRNA fixes this by releasing ribosomes that are stuck on an mRNA. A tmRNA has characteristics of both a tRNA and an mRNA.

In E. coli, the tmRNA is SsrA. SsrA carries an alanine at one end, contains a tag sequence, and is folded to look like a tRNA. SsrA enters the A site of the stuck ribosome, and the alanine on SsrA forms a peptide bond with the polypeptide that is stuck on the ribosome. The tag sequence on the tmRNA is then translated like an mRNA and added to the amino acids on the stuck polypeptide. The string of about 12 added amino acids is called a proteolysis tag. At the end of the tag sequence, a stop codon signals the ribosome to stop translation and release itself as well as the SsrA-tagged peptide. SspB, a helper protein, can then recognize the proteolysis tag on the polypeptide chain and bring it to the protease ClpXP to be destroyed. [1]

11. Catalytic RNA


Catalytic RNA carry out enzymatic reactions. Catalytic RNAs are usually found near proteins where the catalytic activity is found in the RNA portion, rather than protein. [2]



  1. Joan L. Slonczewski, John W. Foster. "Microbiology: An Evolving Science."
  2. Joan L. Slonczewski, John W. Foster. "Microbiology: An Evolving Science."

RNA polymerase is an enzyme that produces RNA: it catalyzes the initiation and elongation of RNA chains from a DNA template. RNA is created in a process known as transcription, and RNA polymerase is a key component of this process. The reaction that this enzyme catalyzes is: (RNA)n + ribonucleoside triphosphate ⇌ (RNA)n+1 + PPi. RNA polymerases are relatively large. The size of RNA polymerase in a typical eukaryotic cell is roughly 500 kDa. In bacteria it is roughly 400 kDa and in T7 bacteriophage it is roughly 100 kDa. Transcription proceeds at about 50 bases per second. A typical mRNA that codes for an average protein takes about 20 seconds to transcribe in a prokaryotic cell and about 3 minutes in a eukaryotic cell. It takes longer in eukaryotes mainly because eukaryotic genes contain many introns.
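The timing figures above follow from dividing transcript length by the transcription rate. The sketch below assumes a constant rate of 50 nucleotides per second; the function name and the two transcript lengths are hypothetical values picked to reproduce the quoted times, not measured data.

```python
def transcription_time_seconds(length_nt: int, rate_nt_per_s: float = 50.0) -> float:
    """Time to transcribe `length_nt` nucleotides at a constant rate."""
    if length_nt < 0 or rate_nt_per_s <= 0:
        raise ValueError("length must be non-negative and rate must be positive")
    return length_nt / rate_nt_per_s

# Hypothetical 1,000-nucleotide bacterial mRNA.
print(transcription_time_seconds(1000))       # 20.0 seconds

# Hypothetical 9,000-nucleotide eukaryotic primary transcript (introns included).
print(transcription_time_seconds(9000) / 60)  # 3.0 minutes
```

Under this assumption, a 3-minute transcription time implies a primary transcript roughly nine times longer than the bacterial one, consistent with the extra length contributed by introns.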

Requirements to Function


For RNA polymerases to carry out their function, the following components must be present for catalysis to occur.

1. A template of DNA. The preferred template is double-stranded DNA. Single-stranded DNA may also serve as a template, but RNA strands or DNA-RNA hybrids may not.

2. Activated precursors. The reaction requires the ribonucleoside triphosphates ATP (adenine-ribose-triphosphate), GTP (guanine-ribose-triphosphate), CTP (cytosine-ribose-triphosphate), and UTP (uracil-ribose-triphosphate): nucleotides with three phosphates attached to the 5' carbon of the ribose sugar.

Example of Ribonucleoside triphosphate (ATP)

3. Divalent metal. A divalent metal ion such as magnesium or manganese is required. Unlike DNA polymerase, RNA polymerase does not need a primer.

The direction of synthesis is from 5' to 3' and the synthesis is driven by the hydrolysis of pyrophosphate. There have been hybridization experiments that show RNA synthesized by RNA polymerase is complementary to its DNA template.
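The complementarity between the template and its transcript can be sketched in a few lines. The function name and the example template are hypothetical; the template is given in the 3' to 5' direction in which it is read, so the returned RNA is written 5' to 3', the direction of synthesis.

```python
# RNA base added opposite each DNA template base (uracil pairs with adenine).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') made from a DNA template given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

print(transcribe("TACGGA"))  # AUGCCU
```

Each RNA base is the Watson-Crick partner of the template base at the same position, which is what the hybridization experiments detect.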


RNA Biogenesis Pol I, Pol II, and Pol III


Gene transcription takes place in the nucleus of eukaryotic cells and is performed by three different multisubunit RNA polymerases: Pol I, Pol II, and Pol III. Little is known about the biogenesis of these RNA polymerases, from their synthesis in the cytoplasm to their arrival in the nucleus for transcription. Only recently have studies shown that polymerase assembly intermediates, assembly factors, and factors required for polymerase nuclear import exist in the cell cytoplasm. Pol II is the best characterized of the three, so it is the basis of most studies on the biogenesis of RNA polymerase.

Structure and Assembly of RNA Polymerase II

RNA Polymerase II Complex

RNA Pol II transcribes mRNAs and small non-coding RNAs and contains 12 polypeptide subunits. Each RNA polymerase has its own specific role in transcription. All three share a conserved ten-subunit catalytic core; the peripheral subunits are what differentiate their structure and function. The core of RNA Pol II has been characterized well enough to serve as a model for the homologous cores of Pol I and Pol III. Pol I and Pol III will bind to opposite sides of Pol II (binding to Rpb1 and Rpb2) and are then divided into three interacting subunits.

3D Structure Model of RNA Polymerase II

The assembly of the eukaryotic RNA polymerase core was first inferred from studies of bacterial RNA polymerase, because the core subunits of RNA Pol II are homologous to those of the bacterial enzyme. Assembly is initiated by the formation of the αα dimer, which interacts with β to form a bound intermediate complex. The active-site cleft is formed by the β and β' subunits and is completed only in the final step of assembly, so the polymerase is not catalytically active until it is complete. Bacterial RNA polymerase and eukaryotic RNA Pol II appear to assemble in an equivalent manner.

Assembly experiments have also been conducted to determine how RNA Pol II is put together. Three mutant large subunits were used, and their assembly was followed in pulse-chase experiments. Scientists found that Rpb2 and Rpb3 were the first to interact, and that the bound complex then interacts with Rpb1. However, because larger mutated subunits were used, final assembly could not be completed without Rpb6, Rpb10, and Rpb12, which are not normally part of final assembly in normal-sized RNA Pol II.

RNA Nuclear Import

If any polymerase subunits are lost during assembly, there will be an excess of Rpb1 in the cytoplasm, meaning that the polymerase needs to be fully assembled before it is allowed to enter the nucleus and take part in transcription. Pol II nuclear localization factors have been identified as functional polymerase-interacting proteins in the cell. The accumulation of Rpb1 is caused by the depletion of GPN1 and GPN3, and the expression of GPN1 leads to the depletion of excess Rpb1. GPN1 binding to Pol II can also be directly influenced by the ability of GTP to bind properly. Homologs of GPN1 also aid in the biogenesis and final assembly of Pol II. GPN1 interacts with the CCT complex, which chaperones many subunits in the formation of Pol II.

Nuclear Import Signal


The individual components of Pol II, the subunits and GPN proteins, are unable to provide a nuclear import signal, which is why Pol II cannot enter the nucleus until it is fully assembled. Iwr1 is a factor that interacts with fully assembled Pol II and supplies a nuclear import signal to it. Deletion of Iwr1 leads to an accumulation of all the Pol II subunits, showing that Iwr1 is most likely the key to proper final assembly. Iwr1 binds to the active site on Pol II and can "sense" completion by interacting with the Rpb1 and Rpb2 subunits, ensuring that Pol II is fully assembled; this acts as the final checkpoint before entry into the nucleus. Because deletion of Iwr1 affects the concentration of subunits in the cytoplasm, a nuclear export signal is used to trigger the recycling of Iwr1. Currently Iwr1 is only known to affect the subunits and factors involving Pol II upon depletion; nothing has been found on how it affects Pol I and Pol III.

Biogenesis for RNA Pol I and Pol III


The biogenesis of Pol I and Pol III may depend on the chaperones Hsp90 and R2TP, because subunits of Pol I and Pol III were found among the client proteins of these two chaperones. Consistent with this, deletion of A135, a Pol I subunit, results in Hsp90 binding to Pol I's largest subunit, A190. Several bleaching experiments on Pol I have revealed that Pol I is assembled at promoter sites. It is unclear what happens to Pol I after transcription, because it remains a stable complex and does not dissociate. Scientists are trying to determine whether Pol I is fundamentally different in other organisms.

Pol III is the least understood of the three polymerases. A nuclear localization signal (NLS) was discovered near the N-terminus of the second-largest Pol III subunit, C128. When this sequence is deleted, C128 and some other Pol III subunits accumulate in the cytoplasm, while the remaining Pol III subunits stay intact and nuclear. This suggests that the core of Pol III is assembled in the cytoplasm and that the remaining subunits bind the core in the nucleus. Pol III appears to follow the same assembly pathway as Pol II, as revealed by native mass spectrometry.

Because all three RNA polymerases share a core of at least ten subunits, it is reasonable to conclude that the assembly of the three polymerases is coordinated and can proceed simultaneously. A given subunit in any of the three polymerases can be better understood by also studying the other subunits at the same stage of biogenesis.

RNA Polymerase Translocates


RNA molecules thousands of nucleotides long are synthesized by multi-subunit DNA-dependent RNA polymerases. The repeated nucleotide condensation reaction proceeds at rates of tens of nucleotides per second. It is tightly linked to the translocation of the enzyme along the DNA template (the threading of the DNA and the emerging RNA molecule through the enzyme). Repeating the nucleotide addition/translocation cycle without separating the DNA from the RNA requires both isomorphic and metamorphic conformational flexibility, large enough to accommodate the essential molecular motions. [1]

Types of RNA Polymerase




Eukaryotic cells have three types of RNA polymerases.

Pol I: Synthesizes most of the ribosomal RNA (rRNA) found in ribosomes. Ribosomes are the protein-making organelles in cells.

Pol II: Creates mRNAs. Messenger RNAs provide the template that ribosomes use for protein synthesis. Pol II also creates many small nuclear RNAs, which help process RNAs after they are formed.

Pol III: Creates tRNAs, which carry amino acids to the ribosome, and 5S ribosomal RNA.

These three types of polymerases can be distinguished from one another in the lab by how strongly they are inhibited by the poison alpha-amanitin. Pol I is completely resistant to this poison, Pol II is highly sensitive, and Pol III is moderately sensitive.

RNA polymerases in eukaryotic cells are composed of several subunits. The majority of them are small and unique to each type of polymerase. However, there are two large subunits that are similar among all of the polymerases. This suggests that all these polymerases evolved from a common ancestral polymerase. The two large subunits are the functional core of the enzyme. The smaller subunits tend to provide the specific functions of each distinct type of polymerase.



In bacteria, the RNA polymerase holoenzyme is made up of two parts, a core polymerase and a sigma factor. The core polymerase has the components needed for elongation in transcription, while the sigma factor is needed only for transcriptional initiation. The core polymerase is made up of two α subunits, one β subunit, and one β' subunit (α2ββ'), while the sigma factor consists of a single σ subunit. In total, there are five kinds of subunit in RNA polymerase: alpha (α), beta (β), beta' (β'), sigma (σ), and omega (ω). The function of omega is not well understood; it is thought to help stabilize RNA polymerase.

In bacteria, the promoter sequence is recognized by the sigma subunit of the RNA polymerase holoenzyme. Upon recognizing the promoter sequence, the sigma factor guides the RNA polymerase to the promoter and helps bind the polymerase to it through the α subunit of the core polymerase. [2]



Archaeal RNA polymerases are quite similar to eukaryotic RNA polymerases, especially RNA polymerase II. These polymerases may have evolved by stripping down eukaryotic systems. Archaeal enzymes are also useful in the lab: thermostable DNA polymerases from archaea (such as Pfu) are used in PCR because they can withstand the high temperatures used to separate DNA strands.



RNA polymerases have a multitude of structural features that help in the transcription process. A structure known as the clamp keeps the polymerase anchored to the DNA. The flap ensures that the mRNA is retained. The rudder helps separate the RNA from the DNA template so that an extended DNA/RNA hybrid does not form. DNA does not enter the mouth of the polymerase directly. It is usually held sideways, with a sharp bend to its left as it exits the polymerase. mRNA is believed to leave from the back of the polymerase. NTPs are thought to enter the active site not through the channel that the DNA is pulled through but through a secondary channel.

Typical RNA polymerase structure

Similarities and Differences between RNA Polymerase and DNA Polymerase


The synthesis of RNA and DNA is similar in many respects. Both proceed in the 5'->3' direction. In both, elongation occurs when the 3'-OH group at the terminus of the growing chain makes a nucleophilic attack on the innermost (α) phosphate of the incoming nucleoside triphosphate. In both, synthesis is driven by the hydrolysis of pyrophosphate. The main differences are that RNA polymerase does not require a primer, whereas DNA polymerase does, and that DNA polymerase can correct mistakes in nucleotide incorporation, whereas RNA polymerase lacks this ability to excise mismatched nucleotides.
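The 5'->3' direction of synthesis can be illustrated with a minimal sketch. It assumes the template strand is written 3'->5' and ignores everything about the enzyme except base pairing; the function name `transcribe` and the example sequence are hypothetical and chosen only for illustration.

```python
# Base-pairing rules for transcription: DNA template base -> RNA base.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') made from a DNA template written 3'->5'."""
    rna = []
    for base in template_3_to_5.upper():
        if base not in COMPLEMENT:
            raise ValueError(f"unexpected DNA base: {base!r}")
        # Each new nucleotide is appended at the 3' end of the growing chain.
        rna.append(COMPLEMENT[base])
    return "".join(rna)


if __name__ == "__main__":
    # Hypothetical template, read 3'->5'.
    print(transcribe("TACGGT"))  # AUGCCA
```

Because the chain grows only at its 3' end, the first base appended is the 5' end of the finished RNA.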


  1. Macromolecular micromovements: how RNA polymerase translocates. Svetlov V, Nudler E.
  2. Joan L. Slonczewski, John W. Foster. "Microbiology: An Evolving Science."

Transcription Elongation Complex (TEC)


To start transcription, RNA polymerase (RNAP) must recognize and bind to a promoter sequence. Initiation factors assist the polymerase in forming an open promoter complex, in which the DNA strands separate and expose the bases, forming a transcription bubble. RNAP then typically undergoes abortive initiation, in which it synthesizes and releases short RNA transcripts while remaining at the promoter. Eventually RNAP escapes the promoter region by forming a stable transcription elongation complex (TEC), which is able to transcribe the whole gene.

Single-molecule Techniques


Atomic Force Microscopy (AFM)


Atomic force microscopy is a technique used to image ultrastructural alterations in the TEC, such as changes in the bend angle of the template DNA induced by RNAP. The TEC is placed on a flat surface and scanned with an AFM cantilever, which is a beam anchored at one end. Deflections of the cantilever are detected by a laser reflected off it. This allows a two-dimensional image of the transcription complex to be reconstructed.

Atomic Force Microscopy

Single Molecule Fluorescence


Another technique used to monitor transcription is to fluorescently tag the RNAP itself. This method allows promoter search or elongation to be monitored with minimal perturbation. Specifically, structural changes in the TEC can be examined using fluorescence resonance energy transfer (FRET). FRET can follow the distance between two labeled positions, such as two nucleotides, by measuring the change in fluorescence intensity.
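How a fluorescence change maps to a distance can be sketched with the standard Förster relation, E = 1 / (1 + (r/R0)^6), which is not stated in the text above and is introduced here as an assumption. The Förster radius R0 depends on the dye pair; the 5 nm value below is only a placeholder, and the function names are hypothetical.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Transfer efficiency E for a donor-acceptor separation (Forster relation)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def distance_from_efficiency(efficiency: float, forster_radius_nm: float) -> float:
    """Invert the Forster relation to recover the separation from a measured E."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    r0 = 5.0  # placeholder Forster radius in nm (assumed, dye-pair dependent)
    for r in (3.0, 5.0, 7.0):
        print(f"r = {r} nm -> E = {fret_efficiency(r, r0):.3f}")
```

The sixth-power dependence is why FRET is most sensitive to distance changes near R0, where the efficiency is 0.5.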



By attaching beads to single RNAP molecules, one can record the position of the beads to determine changes in the location or rotational state of the enzyme. Specifically, the position or rotational state of a bead can be measured sensitively from the light it scatters. One can also apply force to the beads with an optical trap (OT), a tightly focused beam of infrared laser light that exerts force on the beads by means of radiation pressure. In addition, force can be applied by means of laminar fluid flow: the end of the DNA template can be attached to a second bead so that the flow exerts force on the free bead, which places tension on the DNA template.

Transcription Initiation




Transcription requires the holoenzyme to bind a DNA promoter sequence that is hidden within a large excess of genomic DNA. This problem is common to all sequence-specific DNA-binding proteins. Two independent mechanisms, sliding and intersegment transfer, have been proposed to enhance binding by increasing the efficiency of the search process. Sliding occurs when RNAP associates with nontarget DNA and diffuses along it in a random "walk" until it reaches the target site. Intersegment transfer, by contrast, involves the polymerase searching for the promoter by crossing from one position to another while bound simultaneously to both DNA segments.
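The sliding mechanism can be pictured as a one-dimensional random walk. The sketch below is a toy model under stated assumptions: DNA positions are integers, each step moves one position left or right with equal probability, and the ends of the DNA are simple barriers. The function name `sliding_search` and all parameter values are hypothetical.

```python
import random


def sliding_search(start: int, target: int, length: int,
                   rng: random.Random, max_steps: int = 1_000_000) -> int:
    """Count random-walk steps taken to slide from `start` to `target`.

    Positions run from 0 to length - 1; a step past either end is blocked.
    """
    if not (0 <= start < length and 0 <= target < length):
        raise ValueError("start and target must lie on the DNA")
    position, steps = start, 0
    while position != target:
        if steps >= max_steps:
            raise RuntimeError("target not reached within max_steps")
        position += rng.choice((-1, 1))              # unbiased diffusion
        position = max(0, min(length - 1, position))  # stay on the DNA
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    trials = [sliding_search(10, 40, 50, rng) for _ in range(200)]
    print("mean steps to reach the promoter:", sum(trials) / len(trials))
```

A diffusive search needs many more steps than the straight-line distance, which is one reason an additional mechanism such as intersegment transfer could make the search more efficient.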

Open-Complex Formation


After locating a promoter site, the RNAP undergoes a structural transition from the closed complex to the open complex (OPC). The RNAP bends and unwinds a segment of DNA with the aid of initiation factors such as sigma (σ), creating the transcription bubble. In E. coli, the primary sigma factor (σ70) is dubbed the "housekeeping" factor because it directs RNAP to recognize a vast number of promoters. For instance, AFM imaging of an E. coli promoter revealed that the DNA is bent between 55° and 88°, which is consistent with the bend angles inferred from gel mobility assays.

Abortive Initiation


After forming the OPC, RNAP starts synthesizing an RNA oligonucleotide complementary to the DNA template strand. Although RNAP forms a highly stable complex during the elongation phase, the initially transcribing complex (ITC) is highly unstable. It spontaneously releases short RNA chains and restarts synthesis, a process known as "abortive initiation."



On-Pathway Elongation


During transcription, RNAP translocates along the template DNA, synthesizing an mRNA that can be thousands of nucleotides long. When the mRNA reaches 9-11 nt in length, RNAP leaves the promoter region and enters the elongation phase. In this phase, the TEC is very stable and remains tightly bound to both the DNA template and the nascent RNA during nucleotide addition. The major stabilizing factor of the complex is thought to be the base pairing within the RNA:DNA hybrid. The "sliding clamp" model states that the extensive protein-nucleic acid contacts within the polymerase contribute greatly to RNA retention, increasing the overall stability. The "clamp," which consists of narrow protein channels, surrounds the hybrid to prevent any shearing motion between the RNA and the DNA.

Off-Pathway Events


On-pathway elongation is frequently interrupted by entry into off-pathway states, which play an important role in regulating RNA synthesis. One example is transcriptional pausing during elongation. Pauses can reduce the rate of mRNA production, recruit factors to the TEC that modify subsequent transcription, function as a precursor to termination, or lead to messenger splicing. Long "stabilized" pauses, which are known to play a regulatory role, are associated with the formation of RNA hairpins in the transcript, which is thought to inactivate RNAP. A series of studies has shown that pauses lasting 20 seconds or more are associated with base misincorporation during RNA synthesis, suggesting a role in proofreading.



Termination is a difficult step because the TEC is very stable, yet RNAP must dissociate accurately, releasing the mRNA and the DNA template. In prokaryotes, termination occurs at specific sequences that code for a stable hairpin in the nascent RNA. In general, termination may be caused by allosteric interactions between the RNA hairpin and RNAP that trigger the TEC to release the nucleic acids and stop the reaction. Some studies conclude that termination proceeds through an intermediate elongation-incompetent state, whereas others support rapid termination with no intermediates.



Herbert, Kristina M., William J. Greenleaf, and Steven M. Block. "Single-Molecule Studies of RNA Polymerase: Motoring Along." Annual Review of Biochemistry 77 (2008): 149-176. Print.

RNA-dependent RNA polymerase (RdRP) is an enzyme that catalyzes the replication of RNA from an RNA template. This contrasts with the typical, better-known RNA polymerase, which catalyzes the transcription of RNA from a DNA template.



The most famous viral RdRP is the poliovirus 3Dpol. The poliovirus genome is made of RNA, which enters the cell through receptor-mediated endocytosis. This RNA acts as a template for complementary RNA synthesis. The complementary strand in turn acts as a template to produce new viral genomes, which are packaged and then released by lysis of the cell to infect other cells. Because this method of replication involves no DNA, replication is rapid. The downside, however, is that there is no "back-up" DNA copy.

Several eukaryotes also have RdRPs, which are involved in RNA interference; these amplify microRNAs and small temporal RNAs. They also produce double-stranded RNA by using small interfering RNAs as primers. These RdRPs are used in defense mechanisms, but they can be usurped by RNA viruses for their own benefit.

Polio Virus


The first interaction of poliovirus with a host cell is binding to a specific cell-surface protein, the poliovirus receptor (PVR). PVR is a sialylated cell-surface glycoprotein and a member of the immunoglobulin superfamily, whose members contain loops in the protein structure known as Ig domains. PVR has three Ig loops on the outside of the cell, numbered starting from the loop farthest from the cell surface. The virus particle binds its receptor at loop 1.

The poliovirus genome is made of positive-sense single-stranded RNA that encodes a polyprotein of 2,100-2,400 amino acids. Both ends of the genome are modified: the 5' end by a covalently attached basic protein, VPg, which consists of 23 amino acids, and the 3' end by polyadenylation. In a series of cleavages, viral proteases cleave themselves out and break down the polyprotein into 10 separate gene products involved in replication and packaging.

The viral protease 2A cleaves the p220 subunit of the cap-binding complex, which makes host-cell mRNA unrecognizable to ribosomes. In this way the 2A protease abrogates most of the host cell's own protein synthesis. Viral mRNA instead depends on a 5' UTR that contains an internal ribosome entry site, which serves as a docking site for the ribosomal subunits.

The Structure of RDR in Polio

Replication occurs entirely in the cytoplasm. In addition to serving as a template for protein synthesis, the positive-sense genome is used as a template for the synthesis of negative-sense strands. Host cells lack the machinery needed to replicate RNA, so poliovirus uses a viral RNA-dependent RNA polymerase to produce RNA molecules of the opposite polarity. The viral protein VPg, covalently attached to uridine, serves as the primer. The first round of replication produces a single antisense molecule. This antisense template is used to produce copies of the original genome, which are packaged into viral capsids before release.

Once the viral RNA has been translated to produce the necessary proteins and the genome has been replicated, the virus must package the newly synthesized RNA molecules inside capsids to complete the virion. The capsid proteins self-assemble into an immature capsid that contains the required proteins but has not yet undergone the final cleavages. The mature poliovirus capsid has icosahedral symmetry and contains 60 copies each of the viral capsid proteins VP1, VP2, VP3, and VP4. The viral RNA enters the incomplete capsid and is secured inside when the viral proteases make the final cleavages. Once the genomes have been packaged into mature virions, the virus particles await the cell's lysis in order to be released. As many as 100,000 virions can be released from a single infected cell.

Binding of the virus to its receptor causes conformational changes in the capsid. VP4, an internal capsid protein, detaches from the capsid. The capsid swells, and the poliovirus genome becomes susceptible to degradation. When VP1 is released, the genome is released into the cytoplasm of the cell. This viral entry strategy is very inefficient; only about 1% of the viruses initiate an infection.


Discovered in the 1980s, RNA helicases are enzymes that use ATP to bind and remodel RNA and ribonucleoprotein complexes (RNPs). Nearly all helicases work and interact with many other proteins inside a multi-component assembly. While it is unknown exactly how RNA helicases locate their binding sites on these complexes, experiments suggest that they most likely either bind to cofactors, which then guide them to the complex, or find the binding sites themselves according to a complex code of features on the RNAs. RNA helicases play an important role in eukaryotic RNA metabolism and are found in all kingdoms of life, but little is known about how they work in the cell. RNA helicases are similar to DNA helicases and share similar functions.

RNA Helicase Classifications


RNA helicases can also be classified into six superfamilies (SFs). SFs 1 and 2 are comprised of helicases that are non-ring forming. All eukaryotic RNA helicases belong to these superfamilies. SFs 3 to 6 are helicases that can form rings and can be found in bacteria and viruses.

SFs 1 and 2 can be broken down into well-defined helicase families. Each family has distinct structural and functional properties. Six of the families have RNA helicases while the rest consist of DNA helicases. Helicases in SF 1 and 2 have a core made of two similar helicase domains and have at least 12 characteristic sequence motifs at positions in the helicase core. Not all helicases in one family will have the same motifs but they have high sequence conservation. In other families, sequence conservation is low. Across superfamilies, sequence conservation is even lower. This suggests differentiation between DNA and RNA helicases was not an evolutionary force in the classification of helicase families.

The helicase core is also surrounded on either side by C- and N-terminal domains. The terminal domains are essential to the helicase's cellular specificity because they assist specific complexes in recruiting proteins. They accomplish this through their interactions with other proteins or by recognizing specific nucleic acid sections. Unlike the core's sequence motifs, the C- and N-terminal domains are not conserved between families. Certain families in SF1 and SF2 are also identifiable by a characteristic beta-hairpin between the Va and VI motifs of the helicase core. The helicase families that show this beta-hairpin are the Ski2-like, DEAH/RHA and NS3/NPH-II families. Other families, such as the Upf1-like and RIG-I-like families, have noticeable inserts between or within the helicase core domains.

NPH-II helicase, found in vaccinia virus, and NS3 helicase, from the hepatitis C virus, are two RNA helicases that are essential for viral replication and have been studied extensively. Both helicases load onto a 3' single strand of RNA and move toward the 5' end of the strand. They unwind the RNA in bursts and pauses, beginning at the junction of the single and double strand. During a pause, the helicase may be preparing to continue unwinding, or it may dissociate from the RNA. At present, little is known about the fundamental characteristics of helicases acting on RNA.
Source: Li PTX, Vieregg J, Tinoco I Jr. How RNA Unfolds and Refolds. Annu Rev Biochem. 2008;77:77-100.

RNA Helicase Mechanisms


RNA helicases employ two mechanisms for unwinding: canonical unwinding and local strand separation. Both are ATP-dependent because ATP binding is needed not only for the helicase to bind the duplex but also to keep the two helicase domains together. In both mechanisms, the helicase domains surround the nucleic acid in similar orientations and make contact with the RNA's sugar-phosphate backbone. This allows complete attachment of the RNA helicase and movement along the RNA by 1 nt per ATP consumed. Many translocating helicases can move in bursts of up to 18 nt before they perform a rate-limiting step, allowing quick unwinding of the RNA duplex. In local strand separation, structures of the bound RNA strand obtained in the presence of ATP analogs often show bends in its backbone, while in canonical unwinding no such bend is observed. The bends decrease the preference for the duplex structure and most likely represent the RNA conformation of the two strands after the duplex is unwound.

Canonical Duplex Unwinding

Canonical Unwinding Mechanism for RNA Helicase

In canonical unwinding, the RNA helicase attaches to a single-stranded region of the RNA and then translocates along the bound strand. It moves in a defined direction, either 3' to 5' or 5' to 3', as it displaces the complementary strand. Each translocation step involves multiple processes, including ATP binding and hydrolysis, which drive the process forward. This type of unwinding requires the strands to have single-stranded regions in a defined polarity with respect to the duplex. The RNA helicase families known to use this mechanism are Upf1-like, Ski2-like, RIG-I-like and DEAH/RHA.

Local Strand Separation

Duplex Unwinding by Local Strand Separation

Local strand separation occurs when an RNA helicase loads directly onto a duplex region of the RNA and uses ATP to separate the strands. Unlike the canonical method, this type of unwinding requires neither a single-stranded region with a specific orientation nor ATP hydrolysis. ATP binding is sufficient for duplex unwinding to occur; ATP hydrolysis, however, is needed to detach the helicase from the RNA. Sometimes the enzyme dissociates before the strands have completely separated, because the strands quickly re-anneal. As the duplex gets longer, this type of unwinding becomes unfavorable and inefficient. The DEAD-box family unwinds duplexes this way and can only handle duplexes of 10 to 12 base pairs.

Other Functions


RNA helicases also have functions other than unwinding duplexes. They can displace proteins from RNAs, which is called RNP remodeling. RNP remodeling appears to be important to how RNA helicases function, since RNAs are usually bound to proteins in vivo. RNP remodeling does not depend on duplex unwinding, and it is carried out both by helicases that unwind canonically and by DEAD-box proteins. Some helicases can only remove certain proteins, while others can remove a wider variety of proteins.

RNA helicases have also been shown to help in the RNA folding process. One example is the RNA helicases that facilitate and regulate RNA folding in fungal mitochondria as RNA chaperones. They should not be confused with protein chaperones, which also help in RNA folding. RNA chaperones guide the RNA through the series of folding steps while continually proofreading, determining whether the intermediates formed are correct or incorrect. If an intermediate is correct, the process continues; if it is incorrect, the intermediate is discarded and the RNA chaperone opens up a new reaction path for RNA folding. Protein chaperones, on the other hand, catalyze the steps of the folding pathway and help stabilize the resulting RNA structure.

Other helicase families show activities related to the innate immune system. The RIG-I RNA helicase translocates along an RNA duplex, but instead of unwinding it, the helicase acts as a pattern recognition receptor and determines whether a viral RNA is present in the cytoplasm. It detects viral RNA by the long double-stranded RNAs created during viral replication. Only viral RNAs are detected because the majority of RNAs in eukaryotic cells form only short RNA duplexes, which are ideal substrates for the local strand unwinding process.

DEAH/RHA helicases also help in both the separation of a spliced mRNA from the spliceosome and the ligation of the exons of the mRNA. The helicase separates the spliced mRNA from the spliceosome by attaching to the mRNA and moving in the 3' to 5' direction, breaking RNA-RNA and RNA-protein interactions along the way.



Jankowsky, Eckhard. "RNA helicases at work: binding and rearranging." Trends in Biochemical Sciences. xx (2010): 1-11. Web.

Riboswitches are recently discovered RNA domains that function as regulators of gene expression. A riboswitch is a portion of an mRNA strand that is able to bind small molecules and alter gene activity. An mRNA that possesses a riboswitch is able to regulate its own activity depending on whether or not a molecule is bound to it. Riboswitches are located in the 5' untranslated region of messenger RNA. These functional domains exist in bacteria and have also been engineered in the laboratory[47]. Riboswitches are significant because proteins were widely believed to be primarily responsible for the complexity, specificity, and efficiency of gene control. Most riboswitches exist in bacteria, although some have also been found in plants and fungi[48].

Riboswitches were first described by Ronald Breaker's lab in 2002, when they used in-line probing of Escherichia coli btuB mRNA to show that it could bind a metabolite (adenosylcobalamin, AdoCbl) and inhibit translation of the mRNA's product without the help of proteins[49].

The term riboswitch originally referred to messenger RNA that can sense small metabolite molecules. While this is still the common use today, some have broadened the meaning to include other types of RNA. An mRNA that contains a riboswitch can regulate its own activity, which shows that a molecule can evolve to regulate itself. These RNAs can distinguish between very similar molecules or analogs, which shows the precision of the mechanism. This finding revealed that the capabilities of RNA are much greater than once thought. Riboswitches allow RNA to respond to different concentrations of molecules. Because the definition of a riboswitch has expanded, many different kinds are known today.

As the mantra of structural biochemistry is that structure determines function, it is no surprise that the structure of the riboswitch allows for such varied function. Most RNAs do not need to conform to the strict Watson-Crick model of DNA, which allows for many structural variations in RNA. This great variation is responsible for the abilities of riboswitches. Riboswitches are made of two parts: the aptamer domain and the expression platform. The aptamer domain essentially acts as a receptor that binds specific ligands. The expression platform can toggle between two different secondary structures in response to ligand binding, creating a plethora of possible structures. Both parts of a riboswitch contain a switching sequence, which directs the expression of the genes. [1]

Types of Riboswitches


There are several types of riboswitches known, some of which are:

  • TPP riboswitch : this riboswitch binds TPP (thiamin pyrophosphate) in order to regulate the transport and synthesis of thiamin as well as other metabolites with similar properties.
  • Lysine riboswitch : binds to lysine and regulates its biosynthesis, catabolism, and transport.
  • Glycine riboswitch : this riboswitch regulates glycine metabolism. This is the only riboswitch known currently to be able to perform cooperative binding.
  • FMN riboswitch : this riboswitch binds FMN (flavin mononucleotide) in order to regulate the transport and synthesis of riboflavin.
  • Purine riboswitch : binds purines to regulate their transport and metabolism. Different forms of this riboswitch are able to bind either guanine or adenine depending on the pyrimidine in the riboswitch.
  • Cobalamin riboswitch : this riboswitch binds adenosylcobalamin, the coenzyme form of B12 vitamin, in order to moderate the synthesis and transport of cobalamin and other similar metabolites.

Many others are also known, such as the SAM riboswitch, PreQ1 riboswitch, SAH riboswitch, glmS riboswitch, and cyclic di-GMP riboswitch.




Riboswitches consist of two functional components, the conserved aptamer region and the highly variable expression platform. Unlike proteins, which draw on 20 amino acids, riboswitches have only four nucleotides available to generate the specificity required to bind their ligands[50].

The aptamer domain is usually a single binding site that has a highly conserved primary and secondary RNA structure and forms selective binding pockets for ligands. It essentially acts as a sensor for metabolites within the cell. Since it is located at the 5' end of mRNA, it is usually the first to be transcribed by RNA polymerase.

Structural data show that, to improve aptamer-substrate affinity, hydrogen bonds, van der Waals contacts, and other interactions form with the substrate and also with adjacent RNA regions. Other aptamers may use an induced-fit mechanism with deep binding pockets[51].

The expression platform is commonly located downstream from the aptamer.



Most riboswitches function within feedback pathways by sensing metabolites and turning "off" the ability to express genes that would produce proteins that would continue the production of that metabolite[52]. The aptamer region tends to recognize ligands that are closely related to the gene products downstream from the riboswitch expression platform.
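The feedback behavior described above can be sketched with a toy model. It assumes simple one-site binding, where the fraction of aptamers occupied is [L] / (Kd + [L]), and it assumes that a ligand-bound riboswitch is fully "off"; neither assumption comes from the text, and the function names and Kd value are hypothetical.

```python
def fraction_bound(ligand_conc: float, kd: float) -> float:
    """Fraction of aptamers occupied by ligand, assuming one-site binding."""
    if ligand_conc < 0 or kd <= 0:
        raise ValueError("ligand_conc must be >= 0 and kd must be > 0")
    return ligand_conc / (kd + ligand_conc)


def relative_expression(ligand_conc: float, kd: float) -> float:
    """Relative gene expression for an 'off' switch: bound riboswitches are silent."""
    return 1.0 - fraction_bound(ligand_conc, kd)


if __name__ == "__main__":
    kd = 1.0  # hypothetical dissociation constant, arbitrary concentration units
    for conc in (0.0, 0.1, 1.0, 10.0):
        print(f"[metabolite] = {conc:>4} -> expression = {relative_expression(conc, kd):.2f}")
```

In this sketch, expression falls as the metabolite accumulates, which is the negative feedback loop in its simplest form.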


  1. Wang, J., Lee, E., Morales, D., Lim, J., Breaker, R. "Riboswitches that Sense S-adenosylhomocysteine and Activate Genes Involved in Coenzyme Recycling". Molecular Cell 29, 691–702, March 28, 2008.
  2. Nahvi, A., Sudarsan, N., Ebert, M., Zou, X., Brown, K., Breaker, R. "Genetic Control by a Metabolite Binding mRNA". Chemistry & Biology, Vol. 9, 1043-1049, September 2002.
  3. Coppins, R., Hall, K., Groisman, A. "The intricate world of riboswitches". Current Opinion in Microbiology, Volume 10, Issue 2, April 2007, Pages 176-181.
  4. Breaker, R. "Complex Riboswitches". Science, Vol. 319, 1795-1797, 28 March 2008.
  1. Riboswitches, November 14th, 2012.

How RNA Unfolds and Refolds


In general, RNA unfolds from tertiary structure to secondary structure to single-stranded RNA, and folding follows the reverse order. RNA can be unfolded by raising the temperature to denature it, or sometimes by enzymes such as RNA-dependent RNA polymerases (RdRPs) or helicases. Scientists use optical tweezers (also called laser tweezers) and fluorescence resonance energy transfer (FRET) to study how secondary and tertiary RNA structures unfold and refold. Cation binding is also used to study how ribozymes fold and unfold.

Secondary Structure RNA


Secondary RNA structure can be unfolded by increasing the temperature or by using chemical reagents to denature the RNA. Another technique used to study how RNA unfolds is optical tweezers, which apply a force that unfolds the RNA at physiological temperature and in physiological buffer solutions (79). For example, the two ends of an RNA hairpin are attached to two beads, one held in an optical trap and the other held on a micropipette. As the micropipette moves, the RNA is pulled and unzipped.

RNA refolding is the reverse of RNA unfolding. When the micropipette moves back, the tension on the RNA relaxes and the RNA refolds. However, if the relaxation force applied by the optical tweezers is increased, the RNA can misfold (81).

Misfolding in RNA can be corrected by increasing the force. When force is increased, the RNA will try to refold into an active and functional form.

Tertiary Structure RNA


Tertiary RNA structure is relatively weak; therefore, changes in temperature or solution conditions that differ only slightly from the physiological state can destabilize tertiary interactions.

A technique called FRET, fluorescence reson