Diamond: a multi-modal DIA mass spectrometry data processing pipeline

MOTIVATION: We developed Diamond, a Nextflow-based, containerized, multi-modal pipeline for processing data-independent acquisition (DIA) mass spectrometry (MS) data for peptide identification and quantification. Diamond integrates two mainstream workflows for DIA data analysis, spectrum-centric scoring (SCS) and peptide-centric scoring (PCS), and so covers use cases both with and without assay libraries. This multi-modal pipeline serves as a versatile, easy-to-use, and easily extendable toolbox for large-scale DIA data processing.

AVAILABILITY: The Docker image is available at https://hub.docker.com/r/zeroli/diamond and the source code is freely accessible at https://github.com/xmuyulab/Diamond.
