IonQuant Enables Accurate and Sensitive Label-Free Quantification With FDR-Controlled Match-Between-Runs

Missing values weaken the power of label-free quantitative proteomic experiments to uncover true quantitative differences between biological samples or experimental conditions. Match-between-runs (MBR) has become a common approach to mitigate the missing value problem, where peptides identified by tandem mass spectra in one run are transferred to another by inference based on m/z, charge state, retention time, and ion mobility when applicable. Though tolerances are used to ensure such transferred identifications are reasonably located and meet certain quality thresholds, little work has been done to evaluate the statistical confidence of MBR. Here, we present a mixture model-based approach to estimate the false discovery rate (FDR) of peptide and protein identification transfer, which we implement in the label-free quantification tool IonQuant. Using several benchmarking datasets generated on both Orbitrap and timsTOF mass spectrometers, we demonstrate that IonQuant with FDR-controlled MBR results in superior performance compared to MaxQuant. We further illustrate the need for FDR-controlled MBR in sparse datasets such as those from single-cell proteomics experiments.
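To illustrate the general idea behind mixture model-based FDR estimation for transferred identifications, the following is a minimal sketch and not IonQuant's actual implementation. It assumes each transferred identification has already been reduced to a single hypothetical match score, and that false and true transfers each follow a one-dimensional Gaussian distribution. Both are simplifying assumptions made here for illustration. The function names `fit_two_component_mixture` and `select_at_fdr` are likewise hypothetical. The model is fitted by expectation-maximization, and the FDR of an accepted set is estimated as the mean posterior probability of being a false transfer.

```python
import math
import random


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(scores, n_iter=200, min_sd=1e-3):
    """Fit a two-Gaussian mixture by EM (assumed score model, for illustration).

    Component 0 models false transfers, component 1 models true transfers.
    Returns (estimated fraction of true transfers, posterior P(true | score)).
    """
    ordered = sorted(scores)
    half = len(ordered) // 2
    # Initialise the two means from the lower and upper halves of the scores.
    mu = [sum(ordered[:half]) / half,
          sum(ordered[half:]) / (len(ordered) - half)]
    sd = [1.0, 1.0]
    weight_true = 0.5
    post = [0.5] * len(scores)
    for _ in range(n_iter):
        # E-step: posterior probability that each score comes from the true component.
        post = []
        for x in scores:
            f = (1.0 - weight_true) * normal_pdf(x, mu[0], sd[0])
            t = weight_true * normal_pdf(x, mu[1], sd[1])
            post.append(t / (f + t) if f + t > 0.0 else 0.5)
        # M-step: re-estimate means, standard deviations, and the mixing weight.
        n_true = max(sum(post), 1e-9)
        n_false = max(len(scores) - sum(post), 1e-9)
        mu[1] = sum(p * x for p, x in zip(post, scores)) / n_true
        mu[0] = sum((1.0 - p) * x for p, x in zip(post, scores)) / n_false
        var_true = sum(p * (x - mu[1]) ** 2 for p, x in zip(post, scores)) / n_true
        var_false = sum((1.0 - p) * (x - mu[0]) ** 2 for p, x in zip(post, scores)) / n_false
        sd[1] = max(math.sqrt(var_true), min_sd)
        sd[0] = max(math.sqrt(var_false), min_sd)
        weight_true = n_true / len(scores)
    return weight_true, post


def select_at_fdr(posteriors, max_fdr=0.01):
    """Return indices of the largest top-ranked set whose estimated FDR <= max_fdr.

    The estimated FDR of a set is the mean of (1 - posterior) over its members.
    """
    order = sorted(range(len(posteriors)), key=lambda i: posteriors[i], reverse=True)
    expected_false = 0.0
    best = 0
    for rank, i in enumerate(order, start=1):
        expected_false += 1.0 - posteriors[i]
        if expected_false / rank <= max_fdr:
            best = rank
    return order[:best]


if __name__ == "__main__":
    # Synthetic scores only; these are not real MBR data.
    rng = random.Random(1)
    demo = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(4.0, 1.0) for _ in range(800)]
    weight, posteriors = fit_two_component_mixture(demo)
    accepted = select_at_fdr(posteriors, max_fdr=0.01)
    print(f"estimated true fraction: {weight:.2f}, accepted transfers: {len(accepted)}")
```

Filtering on the estimated FDR, rather than on fixed m/z or retention time tolerances alone, lets the acceptance threshold adapt to how well the true and false score distributions separate in a given dataset. That matters most when data are sparse and many candidate transfers are likely to be false.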
