Proteomics/Introduction to Proteomics



Introduction to Proteomics

What is proteomics?

Figure: Information transfer in the central dogma of biology

The focus of proteomics is a biological group called the proteome. The proteome is dynamic: it is defined as the set of proteins expressed in a specific cell under a particular set of conditions. Within a given human proteome, the number of proteins can be as large as 2 million. [1]

Proteins themselves are macromolecules: long chains of amino acids. This amino acid chain is constructed when the ribosome translates messenger RNA (mRNA) transcripts, which are themselves transcribed from DNA in the cell's nucleus. [2] The transfer of information within cells commonly follows this path, from DNA to RNA to protein.
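
The DNA to RNA to protein path can be illustrated with a short sketch. It is a simplification under stated assumptions: the input is taken to be the coding strand of the DNA, the standard genetic code is used, translation starts at the first base instead of searching for a start codon, and the function names `transcribe` and `translate` are chosen here for illustration only.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGA")
print(mrna, translate(mrna))
```

The output of `translate` is a primary (1°) sequence in one-letter amino acid codes.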

Proteins can be organized in four structural levels:

  • Primary (1°): The amino acid sequence, containing members of a (usually) twenty-unit alphabet
  • Secondary (2°): Local folding of the amino acid sequence into α helices and β sheets
  • Tertiary (3°): 3D conformation of the entire amino acid sequence
  • Quaternary (4°): Interaction between multiple small peptides or protein subunits to create a large unit

Each level of protein structure is essential to the finished molecule's function. The primary sequence of the amino acid chain determines where secondary structures will form, as well as the overall shape of the final 3D conformation. The 3D conformation of each small peptide or subunit determines the final structure and function of a protein conglomerate. [3]

There are many different subdivisions of proteomics, including:

Proteomics has both a physical laboratory component and a computational component. The two are often linked; at times, data derived from laboratory work can be fed directly into sequence and structure prediction algorithms. Several types of mass spectrometry are the most frequent source of such data. [5]

The importance of proteomics

Proteomics is a relatively recent field; the term was coined in 1994, and the science itself had its origins in the electrophoretic separation techniques of the 1970s and 1980s. [6] The study of proteins, however, has been a scientific focus for much longer. Studying proteins generates insight into how proteins affect cell processes. Conversely, it also reveals how proteins themselves are affected by cell processes or the external environment.

Proteins provide intricate control of cellular machinery, and are in many cases components of that same machinery. [7] They serve a variety of functions within the cell, and there are thousands of distinct proteins and peptides in almost every organism. This great variety comes in part from a phenomenon known as alternative splicing, in which a single gene in a cell's DNA can give rise to multiple protein types, depending on the demands of the cell at a given time.

The goal of proteomics is to analyze the varying proteomes of an organism at different times, in order to highlight differences between them. Put more simply, proteomics analyzes the structure and function of biological systems. [8] For example, the protein content of a cancerous cell is often different from that of a healthy cell. Certain proteins in the cancerous cell may not be present in the healthy cell, making these unique proteins good targets for anti-cancer drugs. The realization of this goal is difficult; both purification and identification of proteins in any organism can be hindered by a multitude of biological and environmental factors. [9]

Figure: Protein structural levels of interest in proteomics

Proteomics Workflows

The first step of a proteomics workflow is sample preparation, in which proteins are extracted from cells. In the second step, methods such as 2D electrophoresis are used to separate the different proteins. In the third step, the proteins are cut into peptides, since peptides are easier to detect. In the fourth step, mass spectrometry is used to detect the peptides and peptide fragments. Finally, the sequence of the protein is determined by interpreting all the data obtained.
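
The step of cutting proteins into peptides can be mimicked in silico. The sketch below assumes the protease is trypsin, which the text does not specify, and uses the commonly quoted simplified rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name `tryptic_digest` and the example sequence are hypothetical.

```python
def tryptic_digest(sequence):
    """Split a protein sequence into peptides using a simplified trypsin rule.

    Cleavage happens after K or R, except when the following residue is P.
    Missed cleavages and other real-world effects are ignored.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep whatever remains after the last cleavage site.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


print(tryptic_digest("AKRPGLKDE"))
```

Joining the returned peptides in order reproduces the original sequence, which is a quick sanity check on any digestion rule.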


Broad-Based Proteomics

Figure: Broad-based proteomics approach vs. traditional focused approach

Because proteomics is growing at a very rapid pace, the field is shifting away from a specialized, focused way of conducting studies and towards a more global perspective. Broad-based proteomics takes this general perspective by setting out to understand the proteome as a whole. A critical aspect of this strategy is planning ahead, so that the most appropriate plans and technologies can be implemented in the most efficient manner. By developing a strategy tailored to understanding a particular proteome, problems and setbacks can be avoided during the study.

The first step in broad-based proteomics is to develop a hypothesis specific to the proteome being studied. It is best to choose organisms that already have a great deal of genomic information available, since the genome is a useful supplement to proteomic information. Once a hypothesis and organism are established, the proper technologies should be chosen; these technologies should be compatible with whatever biological factors are present (e.g. sample type). Some important and relevant proteomic methods include HPLC, mass spectrometry, SDS-PAGE, two-dimensional gel electrophoresis, and perhaps in silico protein modeling.

Since many combinations of sample type, sample preparation, and analytical technology are possible, careful planning from a broad-based proteomic perspective is critical. By planning upfront, an efficient proteomic study can be conducted. When the efforts of many broad-based proteomic studies are taken together, understanding the proteome in its entirety becomes a realistic possibility.


  1. ^ American Medical Association. "Proteomics."
  2. ^ Hartl, Daniel L., Jones, Elizabeth W. "Genetics: Analysis of Genes and Genomes". Jones and Bartlett Publishers: Boston, 2005.
  3. ^ Weaver, Robert F. "Molecular Biology, 2nd Edition". McGraw Hill: Boston, 2002.
  4. ^ Twyman, Richard. "Proteomics."
  5. ^ Colinge, Jacques and Keiryn L. Bennett. "Introduction to Computational Proteomics". PLoS Comput Biol. 2007 July; 3(7): e114.
  6. ^ "History of Proteomics." Australian Proteome Analysis Facility.
  7. ^ Graves, P. R., T. A. J. Haystead. "Molecular Biologist's Guide to Proteomics". Microbiology and Molecular Biology Reviews: Vol.66 No.1, 2002.
  8. ^ "Proteomics Overview."
  9. ^ van Wijk, K. J. "Challenges and Prospects of Plant Proteomics". Plant Physiol. 2001 June; 126(2): 501-508.

Chapter Written by J. Reuter (Zel2008) and S. Lafergola (DieselSandwich)

Articles Summarized

Advances in Proteomic Workflows for Systems Biology

Johan Malmstrom, Hookeun Lee, and Ruedi Aebersold. Curr Opin Biotechnol 18(4):378-384 (2007)

Main Focus

The article summarizes recent improvements in, as well as some principal limitations of, shotgun tandem mass spectrometry-based proteomics. It also briefly introduces the steps of target-driven quantitative proteomics.


In recent years, great improvements have been made in every part of non-targeted mass spectrometry-based proteomics, including sample preparation, data acquisition, and data processing and analysis. In sample preparation, the introduction of the IEF separation method has greatly improved on the resolution obtained from classical two-dimensional chromatographic peptide separation. Data quality has also improved, through the development of highly reproducible capillary chromatography methods and quantitative analysis by stable isotope labeling. In data acquisition, high mass resolution and accuracy can now be achieved by different types of mass spectrometers, such as TOF-TOF and Q-TOF. Furthermore, different types of mass analyzers and ion sources have been combined to increase proteome coverage. In data processing and analysis, the development of database search tools means that the quality of proteomics data can be assessed and estimated more accurately.

Despite these improvements, shotgun approaches have limitations. For example, shotgun MS datasets are extremely redundant, which greatly affects the identification of peptides present in proteomic samples. The presence of semi-tryptic or non-tryptic peptides makes samples more complex. A saturation effect greatly reduces the discovery rate of new proteins. Many peptides that are detected by mass spectrometry cannot be identified, making sample-to-sample comparison difficult.

These limitations made the development of target-driven quantitative proteomics necessary. The first step of target-driven quantitative proteomics is protein and peptide selection, which can be done both experimentally and computationally. The next step is multiple reaction monitoring (MRM) and data analysis, in which MRM is applied to proteomics data analysis. Relevance to the course: this source is a brief overview of recent improvements in, and some limitations of, non-targeted mass spectrometry-based proteomics (one method of proteomics). It also introduces another field of proteomics: target-driven quantitative proteomics.

New Terms

Electrospray ionization
A technique used in mass spectrometry to produce ions. It is especially useful in producing ions from macromolecules because it overcomes the propensity of these molecules to fragment when ionized.
Matrix-assisted laser desorption/ionization (MALDI)
A soft ionization technique used in mass spectrometry, allowing the analysis of biomolecules (biopolymers such as proteins, peptides and sugars) and large organic molecules (such as polymers, dendrimers and other macromolecules), which tend to be fragile and fragment when ionized by more conventional ionization methods.
A multi-organism, publicly accessible compendium of peptides identified in a large set of tandem mass spectrometry proteomics experiments.
Multiple reaction monitoring
MRM experiments, using a triple quadrupole instrument, are designed for obtaining the maximum sensitivity for detection of target compounds. This type of mass spectrometric experiment is widely used in detecting and quantifying drugs and drug metabolites in the pharmaceutical industry.
FT-ICR mass spectrometry
Fourier transform ion cyclotron resonance mass spectrometry, also known as Fourier transform mass spectrometry, is a type of mass analyzer (or mass spectrometer) for determining the mass-to-charge ratio (m/z) of ions based on the cyclotron frequency of the ions in a fixed magnetic field.
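
The link between cyclotron frequency and m/z in the FT-ICR definition can be made concrete with the textbook relation f = zeB / (2πm). The sketch below is an idealized illustration, not a description of any instrument's software: it ignores trapping-field and space-charge corrections, and the function names and the 7 T field strength are assumptions made for the example.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kilograms per dalton


def cyclotron_frequency_hz(mz, field_tesla):
    """Ideal cyclotron frequency (Hz) of an ion with the given m/z (Da per charge)."""
    mass_per_charge = mz * ATOMIC_MASS_UNIT / ELEMENTARY_CHARGE  # kg per coulomb
    return field_tesla / (2 * math.pi * mass_per_charge)


def mz_from_frequency(frequency_hz, field_tesla):
    """Invert the relation: recover m/z from a measured cyclotron frequency."""
    return field_tesla * ELEMENTARY_CHARGE / (
        2 * math.pi * frequency_hz * ATOMIC_MASS_UNIT
    )


# Hypothetical example: an ion at m/z 1000 in a 7 tesla magnet.
frequency = cyclotron_frequency_hz(1000.0, 7.0)
print(round(frequency), "Hz")
print(mz_from_frequency(frequency, 7.0))
```

Because frequency is inversely proportional to m/z, halving m/z doubles the frequency in this idealized model.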

Course Relevance

This source covers non-targeted mass spectrometry and targeted approaches, both of which are important methods for identifying proteins (an important step in proteomics).

Broad-Based Proteomic Strategies: A Practical Guide to Proteomics and Functional Screening

Graham David, Elliot Steven, Van Eyk Jennifer. "Journal of Physiology" 563: 1-9 (2005)

Main Focus

This article summarizes what broad-based proteomics is and how one can design a study using this global-view strategy. It first briefly looks at the current technology in proteomics and then discusses how these technologies can be incorporated into a study.


Proteomics is becoming a daunting field to enter because many studies get lost in complicated, narrowly focused details. To help with this challenge, a researcher can employ broad-based proteomics. Broad-based proteomics is a strategy in which careful planning is employed upfront to answer a question about a proteome (for instance, comparisons between a tissue in a diseased state and a normal state) using the most appropriate and applicable technologies available. By developing a strategy at the beginning of a proteomics study, possible setbacks during the study are avoided.

The first step is to develop a general hypothesis that is specific to the problem or issue being studied. Since proteomics mirrors genomics, a proteomic study becomes much more difficult when the genome of the model organism is not known. For this reason, organisms where the majority of the genome is known (80% or greater) should be chosen. Once a proper organism has been chosen for study, the next factors to consider are the type of data that will be generated and the sample source. Some proteomic methods yield qualitative data, while others yield quantitative data, so the type of data needed should be determined before a method is chosen. At the same time, the source of the sample is important in determining the extraction and purification methods. Typical sample types include urine, blood (plasma/serum), and mucosal secretions. Protein concentration within the sample is important, and one should expect reasonable extraction if the protein can be visualized on a Coomassie blue-stained gel (> 300 ng). The separation technique chosen should reflect the characteristics of the protein(s) of interest (hydrophobic vs. hydrophilic, molecular mass, etc.).

Another major factor in the planning process is estimating the difficulty of preparing the fractionated sample for mass spectrometry identification. Each mass spectrometry technique requires a different degree of preparation, and some are much more complicated than others (2DE with MS/MS analysis requires greater preparation than HPLC with MS, for instance). Since mass spectrometry is often the step where many proteomic studies encounter difficulty (both in preparation and in interpretation of the results), it is very important to choose a method that is appropriate for the protein sample.

With the advent of proteomic databases in recent years, bioinformatics has had an increasing presence in proteomic studies. For this reason, almost all proteomic studies should incorporate bioinformatics, and it is important for the research team to have some bioinformatics knowledge. The amount of data expected at the end of the study, which depends on the analysis methods chosen, determines how much bioinformatic analysis will be needed.

A final factor to consider is whether to bring in outside assistance or to attempt the study in a more self-contained way. Keeping the study self-contained allows the research team to keep its data integrated and keeps miscommunication to a minimum. Bringing in outside help, on the other hand, could allow a researcher to tackle problems too large for a smaller team to solve. While outside assistance seems promising, it is important not to lose control over the data and to make sure that the team is not spread thin trying to accomplish more than it can handle.

Since there are many ways to study a cell's proteome, careful planning should be implemented at all stages of a proteomics study. Through broad-based proteomics, a researcher can define a test plan before any actual study is performed. And when used appropriately, this strategy can lead to productive and efficient projects that will bring science one step closer to understanding the proteome as a whole.

New Terms

A set of different proteins that form because of single nucleotide polymorphisms in the genomic sequence.
Single-nucleotide polymorphism (SNP)
A DNA sequence variation occurring when a single nucleotide in the genome differs between members of a species.
Post-translational modification (PTM)
The chemical modification of a protein after it has been translated. It is usually one of the last steps in protein biosynthesis.
A subfractionated subset of the proteome. Often these subsets are linked to an area of the cell (an organelle, for instance) or defined by chemical properties.
Peptide mass fingerprinting (PMF)
An analytical technique for protein identification. The unknown protein of interest is first cleaved into smaller peptides; after their masses are determined using mass spectrometry, the masses are compared to either a database containing known protein sequences or a genome.
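
The comparison step in peptide mass fingerprinting can be sketched as follows. This is a minimal illustration, not a real search engine: the protein names and peptide lists in `database` are hypothetical, peptides are assumed to be already cleaved and unmodified, the residue masses are standard monoisotopic values, and the 0.01 Da tolerance is an arbitrary assumption.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # one water is added for the free N- and C-termini


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER_MASS


def count_matches(observed_masses, peptides, tolerance=0.01):
    """Count observed masses that fall within tolerance of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in peptides]
    return sum(
        any(abs(observed - mass) <= tolerance for mass in theoretical)
        for observed in observed_masses
    )


def best_match(observed_masses, database, tolerance=0.01):
    """Return the database entry whose peptides explain the most observed masses."""
    return max(
        database,
        key=lambda name: count_matches(observed_masses, database[name], tolerance),
    )


# Hypothetical database: protein name -> peptides from an in silico digest.
database = {"protein_a": ["GAS", "VLK"], "protein_b": ["GGK", "AAR"]}
observed = [233.1012, 358.2630]
print(best_match(observed, database))
```

Real tools also score how likely a match is to occur by chance; counting matches is only the simplest possible ranking.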

Course Relevance

This article is relevant because a global view of proteomics is becoming more important. As the wealth of information about proteins expands, understanding the proteome from a broad viewpoint is becoming more and more useful.

Websites Summarized

The Association of Biomolecular Resource Facilities: Proteomics Research Group (PRG)

Website committee: Pamela Scott Adams, Michelle Detwiler, David Mohr James Ee, Dr. Xiaolog Yang, Dr. Len Packman, Dr. Anthony Yeung, (3/25/09)

Main Focus

This web page is about how the Association of Biomolecular Resource Facilities relates to proteomics. Of particular importance is the Proteomics Research Group within the ABRF.


The Association of Biomolecular Resource Facilities (ABRF) is an international association of research facilities and laboratories that is focused on core research in biotechnology. The association encourages the sharing of information through conferences, a quarterly journal, and group studies. The ABRF has a heavy influence on the field of proteomics, and there are five main research groups (RG) that deal with proteomics in some way: Protein Expression (PERG), Protein Sequencing (PSRG), Protein Informatics (iPRG), Proteomics (PRG), and Proteomics Standards (sPRG).

Of particular importance, the Proteomics Research Group allows proteomics researchers throughout the world to share their protein analysis information freely. Understanding the proteome means bringing together information on many different proteins, and that information takes a great amount of effort, time, and money to obtain. Sharing protein and subproteomic information is therefore imperative to beginning to understand a proteome in its entirety. This website has numerous links to studies performed by research groups throughout the world.

New Terms

De Novo Peptide Sequencing
Peptide sequencing that is performed without any prior knowledge of the amino acid sequence.
Quantitative Proteomics
Has the goal of obtaining quantitative information about all the proteins in a particular sample. This is useful because it allows one to see the differences between protein samples.

Course Relevance

This is an overview of the Association of Biomolecular Resource Facilities (ABRF) and how it relates to proteomics. There is a great deal of relevant information on this website that those in proteomics will find useful.

Introduction to Proteomics

Writer/Producer: Rick Groleau; Subject Matter Expert: Hanno Steen, PhD; Designer: Peggy Recinos; Developer: Jeffrey Testa (28 March 2009)


Main Focus

This web page is about the importance and challenges of proteomics. It also briefly introduces the major steps of proteomics.


Proteomics is important for understanding biological processes because proteins carry out the functions of the cell. But because the number of proteins is so large and amino acids (the units of a protein) are so small, the study is quite challenging. There are five steps in analyzing protein sequences: sample preparation, separation, ionization, mass spectrometry, and informatics. First, cells are obtained and proteins are extracted from them. Then methods such as 2D electrophoresis are used to separate the proteins. Next, a protease is used to cut the proteins into peptides. Mass spectrometry then identifies individual peptides as well as peptide fragments. Finally, by interpreting the data, the sequence of the proteins can be determined.

New Terms

Biopsy
A biopsy is a medical test involving the removal of cells or tissues from a living subject for examination, to determine the presence or extent of a disease.
Time of flight (TOF)
Time of flight (TOF) describes the method used to measure the time that it takes for a particle, object or stream to reach a detector while traveling over a known distance.
Quadrupole mass spectrometry
The quadrupole mass analyzer is one type of mass analyzer used in mass spectrometry. It consists of four cylindrical rods, set parallel to each other. In a quadrupole mass spectrometer, the quadrupole mass analyzer is the component of the instrument responsible for filtering sample ions based on their mass-to-charge ratio (m/z). Ions are separated in a quadrupole based on the stability of their trajectories in the oscillating electric fields that are applied to the rods.
Electrospray ionization
Electrospray ionization (ESI) is a technique used in mass spectrometry to produce ions. It is especially useful in producing ions from macromolecules because it overcomes the propensity of these molecules to fragment when ionized.
Dalton
The dalton is a unit of measurement for atomic mass. One dalton is equal to 1/12 of the mass of one atom of carbon-12.

Course Relevance

This is an overview of proteomics. It summarizes the procedures and importance of proteomics very briefly.

Introduction to Proteomics

Institute of Biology and Medical Genetics of the First Faculty of Medicine of Charles University and the General Teaching Hospital, (6 April 2009)


Main Focus

This website discusses the aims and definitions of proteomics. It also introduces two important methods in proteomics studies, 2D protein electrophoresis and mass spectrometry, as well as proteomics in medicine.


Proteomics is a broad field that includes expression proteomics, protein distribution in subcellular compartments and organelles, post-translational modifications of proteins, structural proteomics, functional proteomics, clinical proteomics, and so on. Even though RNA/cDNA microarrays make it possible to analyze expression at the transcript level, proteomics is still important, because not all mRNA is translated and because processes such as RNA splicing and post-translational protein modification exist.

Two-dimensional (2D) protein electrophoresis is commonly used to separate proteins based on their isoelectric point (pI) and mass. Mass spectrometry is an important method in proteomics because it can be used not only for protein identification but also for analysis of post-translational protein modifications.

One of the major applications of proteomics in medicine is the identification of markers for every step of treating a disease. Other applications include drug discovery and pharmacoproteomics.

New Terms

Human Proteome Organization (HUPO)
The Human Proteome Organisation (HUPO) is an international scientific organization representing and promoting proteomics through international cooperation and collaborations by fostering the development of new technologies, techniques and training.
Structural proteomics
Structural proteomics is an international collaboration project for solving 3D protein structures at a proteome scale.
Swedish Human Protein Atlas
The Swedish Human Protein Atlas program (HPA), funded by the (non-profit) Knut and Alice Wallenberg Foundation, invites submission of antibodies from both academic and commercial sources to be included in the human protein atlas.
Posttranslational modification (PTM)
Posttranslational modification (PTM) is the chemical modification of a protein after its translation. It is one of the later steps in protein biosynthesis for many proteins.
Isoelectric point
The isoelectric point is the pH value at which the overall charge of the protein equals zero.
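
The isoelectric point definition suggests a simple numerical estimate: compute the net charge of a sequence as a function of pH and search for the pH where it crosses zero. The sketch below is only an approximation under stated assumptions. The pKa values are illustrative, textbook-style numbers (published pKa sets differ), each ionizable group is treated independently with the Henderson-Hasselbalch relation, and the function names are hypothetical.

```python
# Illustrative pKa values (an assumption; published sets differ slightly).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERMINUS_PKA = 8.6
C_TERMINUS_PKA = 3.6


def net_charge(sequence, ph):
    """Approximate net charge of a peptide at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))    # free amino terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))   # free carboxyl terminus
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence, precision=0.001):
    """Find the pH where the net charge is zero by bisection on pH 0 to 14."""
    low, high = 0.0, 14.0
    while high - low > precision:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive, so the pI lies at a higher pH
        else:
            high = mid
    return (low + high) / 2


print(round(isoelectric_point("DDDD"), 2), round(isoelectric_point("KKKK"), 2))
```

Bisection works here because the modeled net charge only decreases as pH rises, so there is a single zero crossing between pH 0 and pH 14.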

Course Relevance

This website gives a brief definition of proteomics and its aims. It also introduces the principles of 2D electrophoresis and mass spectrometry, which are important methods in proteomics.