Structural Biochemistry/Protein function/DNA Binding

A variety of interactions between proteins and DNA are necessary for biological processes in which proteins must recognize specific sections of DNA. These interactions can be categorized by the features of the DNA that the protein recognizes and contacts. Proteins use a combination of these interactions to achieve specificity in DNA binding.


It was previously thought, based on early low-resolution X-ray structures, that the bases exposed in the major groove of the DNA helix correspond to a complementary sequence of amino acids, and that this correspondence accounts for recognition specificity. This model of recognition is referred to as direct readout. Although this mechanism is common and accounts for a significant share of the protein-DNA structures in the Protein Data Bank, it has become clear that a simple one-to-one correspondence between bases and amino acids is insufficient to explain the specificity of protein-DNA interactions. In some cases of DNA recognition, the interactions between the protein and the DNA are less direct and are unlikely to occur without some deformation of the DNA helix. These interactions are defined as indirect readout mechanisms.

Categorization of Protein-DNA Recognition

Illustration of major and minor groove in DNA

The two main categories are base readout and shape readout. In base readout, the protein recognizes the specific chemical signatures of the individual bases. In shape readout, the protein recognizes the sequence-dependent shape of the DNA.

Base Readout

Base readout can be further divided into readout that occurs in the major groove and readout that occurs in the minor groove. Hydrogen bonding is one mechanism of DNA recognition by proteins; it is a greater source of specificity in the major groove than in the minor groove because of the pattern of hydrogen bond donors and acceptors available there. In the minor groove, the donor/acceptor patterns do not distinguish A:T from T:A or G:C from C:G. Specificity based on hydrogen bonds depends both on the number of donor-acceptor pairs and on the unique hydrogen bonding geometry. Within the double helix itself, an A:T base pair is held together by two hydrogen bonds and a G:C base pair by three. Hydrogen bonds between protein and DNA can also be mediated by water molecules; for example, in the Trp repressor, water molecules are found to bridge hydrogen bonds between the protein and the bases. However, this type of water-mediated recognition has been found only in major groove readout, not in minor groove readout.
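The difference between the two grooves can be made concrete with a small sketch. The four-position major groove codes and three-position minor groove codes below are the commonly cited textbook assignment of acceptors, donors, methyl groups, and nonpolar hydrogens; they are an assumption brought in for illustration and are not taken from the text above. The function names are likewise hypothetical.

```python
# Functional groups exposed at the edge of each base pair, read across the
# groove from the first base to the second.
# A = hydrogen bond acceptor, D = hydrogen bond donor,
# M = methyl group, H = nonpolar hydrogen.
# Assumption: these codes follow the usual textbook assignment.
MAJOR_GROOVE = {
    "A:T": "ADAM",
    "T:A": "MADA",
    "G:C": "AADH",
    "C:G": "HDAA",
}
MINOR_GROOVE = {
    "A:T": "AHA",
    "T:A": "AHA",
    "G:C": "ADA",
    "C:G": "ADA",
}

# Hydrogen bonds holding each Watson-Crick pair together, keyed by either base.
WATSON_CRICK_HBONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def indistinguishable_pairs(signatures):
    """Return groups of base pairs that present the same pattern in a groove."""
    groups = {}
    for pair, pattern in signatures.items():
        groups.setdefault(pattern, []).append(pair)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


def count_watson_crick_hbonds(sequence):
    """Count inter-strand hydrogen bonds for a duplex, given one strand."""
    return sum(WATSON_CRICK_HBONDS[base] for base in sequence.upper())


if __name__ == "__main__":
    print(indistinguishable_pairs(MAJOR_GROOVE))  # no ambiguous groups
    print(indistinguishable_pairs(MINOR_GROOVE))  # A:T/T:A and C:G/G:C collapse
    print(count_watson_crick_hbonds("GATTACA"))
```

Under these assumed codes, every base pair presents a distinct pattern in the major groove, while the minor groove separates only A:T or T:A from G:C or C:G. That is consistent with hydrogen bonding being the greater source of specificity in the major groove.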

Displacement of water molecules from the minor groove may also serve as a thermodynamic driving force for DNA binding. Hydrophobic effects may also contribute to recognizing specific bases, such as distinguishing pyrimidines from purines. Although hydrogen bonding is effective in recognizing purine bases such as adenine and guanine, contacts with pyrimidines are mainly hydrophobic.

Example of a DNA bend in a protein-DNA complex (1p78)

Shape Readout

Shape readout can be divided into global and local shape recognition. Variations in DNA shape depend on the chemical interactions of each base pair, which result in a unique conformational signature. Specificity in shape readout depends on deviations from the canonical B-DNA structure, so the protein binds DNA conformations that depart from the ideal B-form.

Local shape readout depends on two main variations: a narrow minor groove and DNA kinks. A kink occurs when the linearity of the helix is broken by the unstacking of base pairs. This promotes optimal contact between amino acids and DNA bases.

Global shape readout applies when the entire DNA binding site deviates from the ideal B-DNA structure. Examples of such structures are A-DNA, bent DNA, and Z-DNA. In A-DNA, sugar groups that are typically not exposed become accessible because of the expanded minor groove, and can thus contact nonpolar amino acids such as alanine, leucine, phenylalanine, and valine. In Z-DNA, the position of the phosphate groups is recognized. For example, RNA adenosine deaminase recognizes the zig-zag phosphate pattern of the left-handed helix.


Rohs, Remo (2010). "Origins of Specificity in Protein-DNA recognition". Annual Review of Biochemistry. Retrieved 2011-11-15.