The tectonic collision of biology with separation science, MS, and informatics occurred over the past 15 years and was driven by contributions from more than 100 laboratories. Like budding yeast, MS is sprouting emergent approaches for the direct profiling and MS/MS analysis of heterogeneous proteins in ever more complex mixtures. Such approaches promise to determine molecular indicators of complex diseases and deepen our understanding of dynamic regulatory mechanisms in cell biology.
From The Top
Before it had a name, proteomics used 2-D gels to fractionate a complex cellular lysate into intact protein spots visualized by staining (Figure 1, right). This “top-down” molecular perspective focused largely on intact protein molecules (albeit at low chemical resolution) expressed by cells and revealed many, though certainly not all, analytical targets for identification. As methods improved and were combined with genome sequencing, a large-scale understanding of protein heterogeneity emerged, along with the realization that multiple protein products can come from a single gene (Figure 1). The importance of this theme in higher organisms has only grown as the implications of the Human Genome Project have become more widely understood. Both the number of proteins modified and the number of modifications per protein are greater in multicellular organisms than in typical bacteria or extremophiles. For eukaryotes such as humans, the main sources of protein heterogeneity are highly similar genes (gene families); coding polymorphisms (different amino acids among individuals in a population); variable processing of messenger RNA; and posttranslational modifications (PTMs), which can involve proteolytic trimming or decoration with any of more than 100 known chemical groups (1). This very heterogeneity is insidiously difficult to measure, yet it underlies changes in protein subcellular location, complexation, degradation, signal transduction, and regulatory control of enzymatic function. These biological events also change the molecular weight of intact proteins.
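Because each modification adds a characteristic mass, the shift in intact molecular weight can be illustrated with a short calculation. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, the 10,000 Da protein is made up, and only three common PTMs are listed, using their standard monoisotopic mass shifts.

```python
from itertools import product

# Standard monoisotopic mass shifts (Da) for three common PTMs.
# The selection is illustrative; more than 100 chemical groups are known.
PTM_SHIFT = {
    "phosphorylation": 79.96633,  # +HPO3
    "acetylation": 42.01057,      # +C2H2O
    "methylation": 14.01565,      # +CH2
}

def modified_mass(unmodified_mass, ptm_counts):
    """Intact mass after adding the given number of each PTM."""
    return unmodified_mass + sum(PTM_SHIFT[name] * n for name, n in ptm_counts.items())

def explain_shift(observed, predicted, tol=0.02, max_each=3):
    """List PTM combinations (up to max_each of each kind) whose total
    shift accounts for the difference between observed and predicted mass."""
    names = list(PTM_SHIFT)
    hits = []
    for counts in product(range(max_each + 1), repeat=len(names)):
        combo = {n: c for n, c in zip(names, counts) if c}
        if combo and abs(modified_mass(predicted, combo) - observed) <= tol:
            hits.append(combo)
    return hits

# Hypothetical example: a 10,000 Da protein carrying one phosphate and one acetyl group.
predicted = 10000.0
observed = modified_mass(predicted, {"phosphorylation": 1, "acetylation": 1})
print(explain_shift(observed, predicted))
```

With the 0.02 Da tolerance assumed here, one acetylation (42.01057 Da) is distinguished from three methylations (42.04695 Da); at lower mass accuracy the two explanations would be indistinguishable from the intact mass alone.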
As impatient proponents of systems biology meet with technological roadblocks, developers are increasingly motivated to improve PTM analysis, including those who focus on the top-down approach to protein analysis (Figure 2b). Interest lies in increasing both the efficiency of protein identification (knowing which gene encodes the protein in question) and the completeness with which protein primary structures (including PTMs) are characterized. To these ends, either specific PTMs are targeted for detection or a high percentage of “sequence coverage” is obtained. Sequence coverage in this sense simply means accurately measuring a mass and checking whether it is consistent with the expected chemical composition (e.g., one predicted from a DNA sequence). Sequencing by MS is more demanding: it requires cleavage between every pair of adjacent backbone positions to provide a mass ladder (composed of mass differences between fragment ions), and it allows sequence determination de novo.
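The mass-ladder idea can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular software: the function name, the tolerance, and the starting offset of the ladder are assumptions, while the residue masses are standard monoisotopic values (leucine and isoleucine are isobaric and share one entry).

```python
# Standard monoisotopic residue masses (Da). "L" stands for Leu or Ile,
# which have identical mass and cannot be told apart from a mass ladder.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def read_ladder(fragment_masses, tol=0.01):
    """Read a sequence from the mass differences between consecutive
    fragment ions; a difference matching no single residue becomes '?'."""
    masses = sorted(fragment_masses)
    sequence = []
    for lower, upper in zip(masses, masses[1:]):
        delta = upper - lower
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(delta - m) <= tol]
        sequence.append(matches[0] if matches else "?")
    return "".join(sequence)

def ladder_for(peptide, offset=1.00728):
    """Build an idealized ladder for a peptide (hypothetical helper;
    the offset is an arbitrary starting mass for illustration)."""
    masses = [offset]
    for aa in peptide:
        masses.append(masses[-1] + RESIDUE_MASS[aa])
    return masses

full = ladder_for("GASP")
print(read_ladder(full))                  # complete ladder
print(read_ladder(full[:3] + full[4:]))   # one rung missing
```

The second call shows why cleavage between every pair of residues matters: with one rung missing, the gap spans two residues and no single residue mass accounts for it. Note also that a missing rung can be misread rather than flagged, since some residue pairs (for example G + A) have nearly the same mass as a single residue (Q).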