Thousand and one ways to quantify and compare protein abundances in label-free bottom-up proteomics.

How to process and analyze MS data to quantify and statistically compare protein abundances in bottom-up proteomics has been an open debate for nearly fifteen years. Two main approaches are generally used: the first relies on spectral data generated during identification (e.g. peptide counting, spectral counting), while the second uses extracted ion currents to quantify chromatographic peaks and infers protein abundances from peptide quantification. Each of these approaches in fact covers multiple methods that were developed over the last decade but have only recently been subjected to in-depth evaluation. In this paper, we compiled these methods as exhaustively as possible. We also summarized how they address the problems raised by bottom-up protein quantification, such as normalization, the presence of shared peptides, unequal peptide measurability and missing data. This article is part of a Special Issue entitled: Plant Proteomics--a bridge between fundamental processes and crop production, edited by Dr. Hans-Peter Mock.
