A simple peak detection and label-free quantitation algorithm for chromatography-mass spectrometry

Background: Label-free quantitation of mass spectrometric data is one of the simplest and least expensive methods for differential expression profiling of proteins and metabolites. Demand for accurate, high-performance computational methods for label-free quantitation remains high in biomarker and drug discovery research. However, the most advanced recent LC-MS instruments, with their high scan speed, mass accuracy and resolution, generate huge amounts of analytical data that are often impossible to interpret manually. Moreover, current label-free methods still leave room for improvement: reducing false positives and false negatives among candidate peaks, expanding scalability, and enhancing and automating data processing. AB3D (A simple label-free quantitation algorithm for Biomarker Discovery in Diagnostics and Drug discovery using LC-MS) addresses these issues and can perform label-free quantitation using MS1 data for proteomics studies.

Results: We developed AB3D, a label-free peak detection and quantitation algorithm that uses MS1 spectral data. To test the algorithm, we evaluated practical applications of AB3D on three LC-MS datasets and compared it with four widely used software tools (MZmine 2, MSight, SuperHirn and OpenMS) on the same datasets. All quantitative results were confirmed manually. AB3D properly identified and quantified known peptides with fewer false positives and false negatives than the four other tools, both for a standard peptide mixture and for real complex biological samples of Bartonella quintana (strain JK31). Moreover, AB3D showed the best reliability when the variability between two technical replicates was compared using a complex peptide mixture of HeLa and BSA samples. In terms of performance, the AB3D algorithm was about 1.2 to 15 times faster than the four other tools.

Conclusions: AB3D is a simple and fast algorithm for label-free quantitation using MS1 mass spectrometry data, suited to large-scale LC-MS data analysis, with higher true positive rates and reasonable false positive rates. Furthermore, AB3D demonstrated the best reproducibility and was about 1.2 to 15 times faster than the four existing software tools.
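
The evaluation described above counts false positives and false negatives against manually confirmed peaks. As a minimal sketch of how such a count could be made (this is not AB3D's own code or the authors' evaluation procedure), the Python snippet below matches detected peaks to a reference list within m/z and retention-time tolerances. The function name `score_detection`, the tolerance values and the example peak lists are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A peak is represented as (m/z, retention time in minutes); this layout is an assumption.
Peak = Tuple[float, float]


def score_detection(
    detected: List[Peak],
    reference: List[Peak],
    mz_tol: float = 0.01,  # hypothetical m/z tolerance
    rt_tol: float = 0.5,   # hypothetical retention-time tolerance (minutes)
) -> Tuple[int, int, int]:
    """Return (true_positives, false_positives, false_negatives)."""
    unmatched = list(detected)
    true_positives = 0
    for ref_mz, ref_rt in reference:
        # Greedy matching: take the first unmatched detected peak
        # that lies within both tolerances of the reference peak.
        for i, (mz, rt) in enumerate(unmatched):
            if abs(mz - ref_mz) <= mz_tol and abs(rt - ref_rt) <= rt_tol:
                true_positives += 1
                del unmatched[i]
                break
    false_positives = len(unmatched)                  # detected, not in reference
    false_negatives = len(reference) - true_positives  # in reference, not detected
    return true_positives, false_positives, false_negatives


# Made-up example data, not taken from the paper.
reference = [(500.25, 12.0), (650.80, 20.5), (722.40, 31.2)]
detected = [(500.251, 12.1), (650.805, 20.4), (801.00, 40.0)]
print(score_detection(detected, reference))  # (2, 1, 1)
```

Greedy first-match assignment keeps the sketch short; a real comparison of dense peak lists would more likely use nearest-neighbour or optimal assignment so that one detected peak cannot shadow a closer candidate.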
