Data‐Independent Acquisition Mass Spectrometry‐Based Proteomics and Software Tools: A Glimpse in 2020

This review provides a brief overview of the development of data‐independent acquisition (DIA) mass spectrometry‐based proteomics and of selected DIA data analysis tools. First, the DIA acquisition schemes used in proteomics are summarized, including Shotgun‐CID, DIA, MSE, PAcIFIC, AIF, SWATH, MSX, SONAR, WiSIM, BoxCar, Scanning SWATH, diaPASEF, and PulseDIA, along with the mass spectrometers that enable these methods. Next, software tools for DIA data analysis are classified into three groups: library‐based tools, library‐free tools, and statistical validation tools. Approaches for generating spectral libraries are reviewed for six library‐based DIA data analysis tools that the authors have tested: OpenSWATH, Spectronaut, Skyline, PeakView, DIA‐NN, and EncyclopeDIA. A growing number of library‐free DIA data analysis tools have been developed, including DIA‐Umpire, Group‐DIA, PECAN, and PEAKS, which facilitate the identification of novel proteoforms. The authors share their experience of when to use DIA‐MS and of using several selected DIA data analysis software tools. Finally, the state of the art in DIA mass spectrometry and software tools is summarized, together with the authors’ views on future directions.
