The use of the isotopic distribution as a complementary quality metric to assess tandem mass spectra results.

Shotgun proteomics is a powerful technology for studying the protein population of a biological system. The approach employs tandem mass spectrometry (tandem MS) for amino acid sequencing. The masses of the fragment ions can be used by database-search algorithms, such as SEQUEST or Mascot, to identify peptides. The database-search method depends on a score function that evaluates matches between the predicted ions and the ions observed in the tandem mass spectrum. In general, peptide identification based on tandem MS and database-search algorithms does not take into account information about the isotope distributions of the precursor ions. To determine how effectively these search algorithms distinguish between correct and incorrect peptide assignments, we propose an additional metric, based on Pearson's χ² statistic, that quantifies the similarity between the theoretical isotopic distribution of the precursor ions selected for tandem MS and the distribution observed in the experimental mass spectra. The observed association between Pearson's χ² statistic and the score function indicates that good scores can be obtained for molecules that exhibit atypical isotope profiles, while low scores can be obtained for fragment spectra whose precursors show a clear peptide-like isotope pattern. These results demonstrate that Pearson's χ² statistic can be used in conjunction with the score of database-search algorithms to increase the sensitivity and specificity of peptide identification.

BIOLOGICAL SIGNIFICANCE In this manuscript, we present a workflow that provides a new perspective on the quality of the peptide-to-spectrum matches (PSMs) employed in database-search strategies for peptide identification. Additional views on a dataset can facilitate a more hypothesis-driven interpretation of the mass spectrometry signals. The similarity metric complements the PSM score by taking the isotopic profile into account: it measures how peptide-like the precursor ion of the selected tandem MS spectrum appears. Close agreement between the PSM score and the similarity metric results in higher confidence in the identification of the selected precursor ion.
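As an illustration of the similarity metric described above, the following is a minimal sketch of Pearson's χ² statistic applied to an isotope pattern. It is not the authors' implementation. The function name `pearson_chi2`, the choice to normalise both patterns to unit sum before comparison, and the example intensities are all assumptions made for this sketch; the theoretical distribution is assumed to have been computed elsewhere for the candidate peptide.

```python
from typing import Sequence


def pearson_chi2(observed: Sequence[float], theoretical: Sequence[float]) -> float:
    """Pearson's chi-squared statistic between two isotope patterns.

    Both inputs list the intensities of the same isotopic peaks
    (monoisotopic, +1, +2, ...). Each pattern is normalised to sum to 1
    (an assumption of this sketch), so the result does not depend on the
    absolute signal intensity. Smaller values mean closer agreement.
    """
    if len(observed) != len(theoretical):
        raise ValueError("patterns must cover the same isotopic peaks")
    obs_total = sum(observed)
    theo_total = sum(theoretical)
    if obs_total <= 0 or theo_total <= 0:
        raise ValueError("patterns must have positive total intensity")

    chi2 = 0.0
    for obs, theo in zip(observed, theoretical):
        p_obs = obs / obs_total
        p_theo = theo / theo_total
        if p_theo <= 0:
            raise ValueError("theoretical probabilities must be positive")
        # Squared deviation of the observed fraction, scaled by the expected one.
        chi2 += (p_obs - p_theo) ** 2 / p_theo
    return chi2


# Hypothetical intensities for the first four isotopic peaks of a precursor.
theoretical_pattern = [0.55, 0.30, 0.11, 0.04]
observed_pattern = [5200.0, 3100.0, 1150.0, 420.0]
print(pearson_chi2(observed_pattern, theoretical_pattern))
```

A value near zero indicates that the precursor's isotope pattern resembles the one expected for the assigned peptide, and a large value flags an atypical profile. How such a value would be combined with a PSM score, and which threshold to use, is left open here.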
