Protein tags


Protein tags are peptide sequences genetically grafted onto a recombinant protein. Often these tags are removable by chemical agents or by enzymatic means, such as proteolysis or intein splicing. Tags are attached to proteins for various purposes.

Affinity tags are appended to proteins so that they can be purified from their crude biological source using an affinity technique. These include chitin binding protein (CBP), maltose binding protein (MBP), and glutathione-S-transferase (GST). The poly(His) tag is a widely used protein tag; it binds to matrices carrying immobilized metal ions such as nickel or cobalt.

Solubilization tags are used, especially for recombinant proteins expressed in chaperone-deficient species such as E. coli, to assist in the proper folding of proteins and keep them from precipitating. These include thioredoxin (TRX) and poly(NANP). Some affinity tags, such as MBP and GST, have a dual role as solubilization agents.

Chromatography tags are used to alter the chromatographic properties of the protein to afford different resolution across a particular separation technique. Often these consist of polyanionic stretches of amino acids, as in the FLAG-tag.

Epitope tags are short peptide sequences which are chosen because high-affinity antibodies can be reliably produced in many species. These are usually derived from viral genes, which explains their high immunoreactivity. Epitope tags include V5-tag, Myc-tag, and HA-tag. These tags are particularly useful for western blotting, immunofluorescence and immunoprecipitation experiments, although they also find use in antibody purification.

Fluorescence tags are used to give a visual readout on a protein. GFP and its variants are the most commonly used fluorescence tags. More advanced applications of GFP include using it as a folding reporter (fluorescent if folded, colorless if not).

Protein tags find many other uses, such as specific enzymatic modification (e.g. biotin ligase tags) and chemical modification (e.g. the FlAsH tag). Often tags are combined to produce multifunctional modifications of the protein. However, each added tag carries the risk that the native function of the protein is abolished or compromised by interactions with the tag. Therefore, after purification, tags are commonly removed by specific proteolysis (e.g. by TEV protease, thrombin, Factor Xa or enteropeptidase).
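Tag removal by specific proteolysis can be illustrated with a minimal Python sketch. The recognition motifs below are commonly cited consensus sites and are assumptions added here, not taken from the text; real proteases tolerate variations and may cut elsewhere. The target sequence is a hypothetical placeholder.

```python
import re

# Commonly cited recognition motifs (assumptions for illustration only).
# Each pattern is written so that the match ends at the cut position.
PROTEASE_SITES = {
    "TEV": re.compile(r"ENLYFQ(?=[GS])"),     # cuts between Q and G/S
    "thrombin": re.compile(r"LVPR(?=GS)"),    # cuts between R and G
    "factor_xa": re.compile(r"I[ED]GR"),      # cuts after R
    "enteropeptidase": re.compile(r"DDDDK"),  # cuts after K
}

def cleave(sequence, protease):
    """Split a one-letter protein sequence after every matching site."""
    pattern = PROTEASE_SITES[protease]
    fragments, start = [], 0
    for match in pattern.finditer(sequence):
        fragments.append(sequence[start:match.end()])
        start = match.end()
    fragments.append(sequence[start:])
    # Drop empty fragments (e.g. when a site sits at the very end).
    return [fragment for fragment in fragments if fragment]

TARGET = "MKTAYIAKQR"  # hypothetical protein of interest

# N-terminal FLAG-tag removed by enteropeptidase
print(cleave("DYKDDDDK" + TARGET, "enteropeptidase"))
# N-terminal His-tag followed by a TEV site
print(cleave("HHHHHH" + "ENLYFQG" + TARGET, "TEV"))
```

In this sketch enteropeptidase leaves no extra residues on the target, whereas TEV cleavage leaves the residue that follows the cut (here G) at the new N-terminus.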





The peptide sequence of the FLAG-tag from the N-terminus to the C-terminus is DYKDDDDK (1012 Da). It can be used in conjunction with other affinity tags, for example a polyhistidine tag (His-tag), HA-tag or myc-tag. Additionally, it may be used in tandem, commonly as the 3xFLAG peptide: DYKDHDG-DYKDHDI-DYKDDDDK (with the final tag encoding an enterokinase cleavage site). It can be fused to the C-terminus or the N-terminus of a protein. Some commercially available antibodies (e.g., M1/4E11) recognize the epitope only when it is present at the N-terminus. However, other available antibodies (e.g., M2) are position-insensitive.
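The mass of the FLAG peptide can be checked directly from its sequence. The following Python sketch sums standard monoisotopic residue masses and adds one water for the free termini; the mass table is an addition here (only the residues occurring in FLAG and 3xFLAG are listed), and the function name is illustrative.

```python
# Monoisotopic residue masses in Da (standard values; only residues
# that occur in the FLAG and 3xFLAG peptides are listed).
RESIDUE_MASS = {
    "D": 115.02694,
    "Y": 163.06333,
    "K": 128.09496,
    "H": 137.05891,
    "G": 57.02146,
    "I": 113.08406,
}
WATER = 18.01056  # one H2O accounts for the free N- and C-termini

def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER

FLAG = "DYKDDDDK"
THREE_X_FLAG = "DYKDHDG" + "DYKDHDI" + "DYKDDDDK"

print(f"FLAG:   {len(FLAG)} residues, {peptide_mass(FLAG):.1f} Da")
print(f"3xFLAG: {len(THREE_X_FLAG)} residues, {peptide_mass(THREE_X_FLAG):.1f} Da")
```

The single FLAG peptide comes out at roughly 1012 Da, that is, about 1 kDa.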

The FLAG-tag was the second example of a fully functional epitope tag to be published in the scientific literature and was the only epitope tag to be patented. Unlike some other tags (e.g. myc, HA), where a monoclonal antibody was first isolated against an existing protein, then the epitope was characterized and used as a tag, the FLAG epitope was designed first, and then monoclonals were raised to recognize it. The FLAG tag's structure has been optimized for compatibility with the proteins it is attached to, in that it is more hydrophilic than other common epitope tags and therefore less likely to denature or inactivate proteins to which it is appended. In addition, N-terminal FLAG tags can be removed readily from proteins once they have been isolated, by treatment with the specific protease, enterokinase (enteropeptidase).



Glutathione S-transferases (GSTs) comprise a family of eukaryotic and prokaryotic metabolic isozymes best known for their ability to catalyze the conjugation of the reduced form of glutathione (GSH) to xenobiotic substrates for the purpose of detoxification.

Because GST binds to immobilized glutathione, it can be used as a fusion partner: a GST-tag is often used to separate and purify proteins. The tag is 220 amino acids (roughly 26 kDa) in size, which, compared to tags such as the Myc-tag or the FLAG-tag, is quite large. It is typically fused to the N-terminus of a protein. Many commercially available GST-fusion plasmids include a thrombin cleavage site so that the GST tag can be removed during protein purification.



A polyhistidine-tag is an amino acid motif in proteins that consists of at least six histidine (His) residues, often at the N- or C-terminus of the protein. It is also known as hexa histidine-tag, 6xHis-tag, His6 tag and by the trademarked name His-tag.
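The definition above (a run of at least six histidines, often near a terminus) translates directly into a pattern search. The Python sketch below is illustrative only: the function name, the `window` parameter that decides what counts as "near a terminus", and the example sequences are all assumptions.

```python
import re

# At least six consecutive histidines, per the definition above.
HIS_TAG = re.compile(r"H{6,}")

def locate_his_tag(sequence, window=10):
    """Classify the first polyhistidine run in a one-letter sequence.

    Returns 'N-terminal', 'C-terminal', 'internal', or None if no run of
    six or more histidines is present. `window` is an arbitrary cutoff
    (in residues) for how close to a terminus the run must lie.
    """
    match = HIS_TAG.search(sequence)
    if match is None:
        return None
    if match.start() <= window:
        return "N-terminal"
    if len(sequence) - match.end() <= window:
        return "C-terminal"
    return "internal"

BODY = "MKTAYIAKQR" * 5  # hypothetical protein sequence

print(locate_his_tag("MHHHHHHSSG" + BODY))  # tag right after the start Met
print(locate_his_tag(BODY + "LEHHHHHH"))    # tag at the very end
print(locate_his_tag(BODY))                 # untagged
```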

Polyhistidine-tags are often used for affinity purification of polyhistidine-tagged recombinant proteins expressed in Escherichia coli and other prokaryotic expression systems. Bacterial cells are harvested by centrifugation, and the resulting cell pellet is lysed by physical means, by detergents and enzymes such as lysozyme, or by any combination of these. At this stage the raw lysate contains the recombinant protein among many other proteins originating from the bacterial host. This mixture is incubated with an affinity resin containing bound bivalent nickel or cobalt ions; such resins are available commercially in different varieties. Nickel and cobalt have similar properties, as they are adjacent period 4 transition metals (see iron triad). The resins are generally sepharose/agarose functionalised with a chelator, such as iminodiacetic acid (Ni-IDA) or nitrilotriacetic acid (Ni-NTA) for nickel and carboxymethylaspartate (Co-CMA) for cobalt, to which the polyhistidine-tag binds with micromolar affinity. The resin is then washed with phosphate buffer to remove proteins that do not specifically interact with the cobalt or nickel ion. With Ni-based methods, washing efficiency can be improved by the addition of 20 mM imidazole (proteins are usually eluted with 150-300 mM imidazole). Generally, nickel-based resins have higher binding capacity, while cobalt-based resins offer the highest purity.

Polyhistidine-tagging is the option of choice for purifying recombinant proteins in denaturing conditions because its mode of action is dependent only on the primary structure of proteins.

Polyhistidine-tag columns retain several well-known proteins as impurities. One of them is SlyD, an FKBP-type peptidyl-prolyl isomerase, which appears around 25 kDa. Such impurities are generally eliminated using a secondary chromatographic technique, or by expressing the recombinant protein in a SlyD-deficient E. coli strain. Alternatively, cobalt-based resins do not bind SlyD from E. coli and can be used for single-step purification.



Maltose-Binding Protein (MBP) is a part of the maltose/maltodextrin system of Escherichia coli, which is responsible for the uptake and efficient catabolism of maltodextrins. MBP has an approximate molecular mass of 42.5 kilodaltons.

MBP is used to increase the solubility of recombinant proteins expressed in E. coli. In these systems, the protein of interest is often expressed as an MBP fusion protein, which prevents aggregation of the protein of interest. The mechanism by which MBP increases solubility is not well understood. In addition, MBP can itself be used as an affinity tag for purification of recombinant proteins. The fusion protein binds to amylose columns while all other proteins flow through. The MBP fusion protein can then be purified by eluting the column with maltose.



The Strep-tag is a synthetic peptide consisting of eight amino acids (WSHPQFEK). This peptide sequence exhibits intrinsic affinity towards Strep-Tactin, a specifically engineered streptavidin, and can be fused to the N- or C-terminus of recombinant proteins. By exploiting this highly specific interaction, Strep-tagged proteins can be isolated in one step from crude cell lysates. Because the Strep-tag elutes under gentle, physiological conditions, it is especially suited for the generation of functional proteins.



Tandem affinity purification (TAP) uses two different kinds of affinity chromatography in order to purify a protein.

The original TAP method involves the fusion of the TAP tag to the C-terminus of the protein under study. Starting from its N-terminus, the TAP tag consists of calmodulin binding peptide (CBP), followed by a tobacco etch virus protease (TEV protease) cleavage site and Protein A, which binds tightly to IgG. The relative order of the modules of the tag is important because Protein A needs to be at the extreme end of the fusion protein so that the entire complex can be retrieved using an IgG matrix.

There are a few methods by which the fusion protein can be introduced into the host. If the host is yeast, one method is to use plasmids from which the fusion protein is expressed within the host. Whichever method is used, it is preferable to maintain expression of the fusion protein as close as possible to its natural level.

Once the fusion protein has been translated within the host, the protein under study at one end of the fusion can interact with other proteins. The cells are then broken open, and the fusion protein, together with the other constituents attached to the protein under study, is retrieved by affinity selection on an IgG matrix.

After washing, TEV protease is introduced to elute the bound material by cutting at the TEV protease cleavage site. This eluate is then incubated with calmodulin-coated beads in the presence of calcium. This second affinity step is required to remove the TEV protease as well as traces of contaminants remaining after the first affinity step. After washing, the bound material is released with ethylene glycol tetraacetic acid (EGTA).
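The logic of the two affinity steps, and why the module order matters, can be captured in a deliberately simplified toy model. In the Python sketch below a fusion protein is just a tuple of module names from N- to C-terminus; all names are hypothetical, and the rules only encode the statements made in the text (Protein A at the extreme end for IgG capture, release by TEV cleavage, calcium-dependent calmodulin binding reversed by EGTA), not real binding behaviour.

```python
# Toy model: a fusion protein is a tuple of module names, N- to C-terminus.
TAP_TAG = ("CBP", "TEV_site", "ProteinA")

def build_c_terminal_fusion(target):
    """Append the TAP tag to the C-terminus of the protein under study."""
    return (target,) + TAP_TAG

def igg_capture(fusion):
    # Step 1: retained on the IgG matrix only if Protein A is the
    # extreme C-terminal module (the ordering rule from the text).
    return fusion[-1] == "ProteinA"

def tev_release(fusion):
    # TEV protease cuts at its site; everything N-terminal of the site
    # is released, while Protein A stays on the IgG matrix.
    return fusion[:fusion.index("TEV_site")]

def calmodulin_capture(fragment, calcium):
    # Step 2: CBP binds calmodulin beads only while calcium is present.
    return calcium and "CBP" in fragment

def tap_purify(fusion):
    """Return the finally eluted fragment, or None if a step fails."""
    if not igg_capture(fusion):
        return None
    released = tev_release(fusion)
    if not calmodulin_capture(released, calcium=True):
        return None
    # EGTA chelates calcium, so binding is lost and the fragment elutes.
    assert not calmodulin_capture(released, calcium=False)
    return released

print(tap_purify(build_c_terminal_fusion("bait")))
print(tap_purify(("bait", "ProteinA", "TEV_site", "CBP")))  # wrong order
```

With the correct order the bait is recovered still carrying CBP; with the modules reversed the model fails at the first step.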

Many other tag combinations have been proposed since the TAP principle was first published.