Design and Analysis of Quantitative Differential Proteomics Investigations Using LC-MS Technology

Liquid chromatography-mass spectrometry (LC-MS)-based proteomics is an increasingly important tool for characterizing the abundance of proteins in biological samples of various types and across conditions. Effects of disease or drug treatments on protein abundance are of particular interest for the characterization of biological processes and the identification of biomarkers. Although state-of-the-art instrumentation can make high-quality measurements and commercial software is available to process the data, the complexity of the technology and of the data presents challenges for bioinformaticians and statisticians. Here, we describe a pipeline for the analysis of quantitative LC-MS data. Key components of this pipeline include experimental design (sample pooling, blocking, and randomization) as well as deconvolution and alignment of mass chromatograms to generate a matrix of molecular abundance profiles. An important challenge in LC-MS-based quantitation is to accurately identify members of protein families and assign abundance measurements to them. To address this issue, we implement a novel statistical method for inferring the relative abundance of related members of protein families from tryptic peptide intensities. This pipeline has been used to analyze quantitative LC-MS data from multiple biomarker discovery projects. We illustrate the pipeline with examples from two of these studies and show that it constitutes a complete, workable framework for LC-MS-based differential quantitation. Supplementary material is available at http://iec01.mie.utoronto.ca/~thodoros/Bukhman/.
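As an illustration of the blocking and randomization step named in the experimental design, the following is a minimal sketch of a randomized complete block layout for LC-MS run order. It is not the authors' implementation: the function name `randomized_block_order`, the `(sample_id, condition)` input format, and the assumption that each block holds an equal number of samples per condition are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def randomized_block_order(samples, block_size, seed=0):
    """Group samples into balanced blocks and randomize run order within each.

    samples    : list of (sample_id, condition) tuples (hypothetical format)
    block_size : number of runs per block; must be a multiple of the
                 number of distinct conditions so every block is balanced
    Returns a list of blocks, each a list of (sample_id, condition) tuples
    in the order they would be run.
    """
    rng = random.Random(seed)  # fixed seed so the layout is reproducible

    # Collect sample ids per condition.
    by_condition = defaultdict(list)
    for sample_id, condition in samples:
        by_condition[condition].append(sample_id)

    conditions = sorted(by_condition)
    if block_size % len(conditions) != 0:
        raise ValueError("block_size must be a multiple of the number of conditions")
    per_condition = block_size // len(conditions)

    # Randomize which samples of each condition land in which block.
    for ids in by_condition.values():
        rng.shuffle(ids)

    # Only complete, balanced blocks are produced; leftovers are not assigned.
    n_blocks = min(len(ids) for ids in by_condition.values()) // per_condition

    blocks = []
    for b in range(n_blocks):
        start = b * per_condition
        block = [
            (sample_id, condition)
            for condition in conditions
            for sample_id in by_condition[condition][start:start + per_condition]
        ]
        rng.shuffle(block)  # randomize run order within the block
        blocks.append(block)
    return blocks


if __name__ == "__main__":
    demo = [(f"C{i}", "control") for i in range(4)] + \
           [(f"T{i}", "treated") for i in range(4)]
    for i, block in enumerate(randomized_block_order(demo, block_size=4, seed=1)):
        print(f"block {i}: {block}")
```

Balancing conditions within each block keeps a block-level effect (for example, instrument drift between run days) from being confounded with the condition effect, while shuffling within the block guards against systematic run-order bias.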
