D‐score: A search engine independent MD‐score

While peptides carrying PTMs are routinely identified in gel‐free MS, localizing the PTMs on the peptide sequences remains challenging. Several approaches have used the search engine scores of secondary peptide matches to assess the confidence of site assignment, penalizing the localization whenever the search engine assigned similar scores to two candidate peptides with different site assignments. In the present work, we show how estimating posterior error probabilities for peptide candidates allows the calculation of a PTM localization score, called the D‐score, in studies using multiple search engines. We demonstrate the applicability of this score to three popular search engines, Mascot, OMSSA, and X!Tandem, and evaluate its performance using a previously published high‐resolution data set of synthetic phosphopeptides. For peptides with uncertain phosphorylation site assignment, the number of spectrum matches with correctly localized phosphorylation increased by up to 25.7% compared to using Mascot alone, although the actual increase depended on the fragmentation method used. Since this method relies only on search engine scores, it can be readily applied to score the localization of virtually any modification at no additional experimental or in silico cost.
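The idea of scoring a localization from posterior error probabilities (PEPs) can be illustrated with a minimal sketch. The abstract does not give the exact definition of the D‐score, so the formula below is an assumption made for illustration only: the score is taken as the PEP difference, in percent, between the best candidate and the best‐ranked candidate with the same sequence but a different site assignment. The names `Candidate` and `delta_pep_score` are hypothetical and do not come from the original work.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """One peptide candidate for a spectrum (hypothetical structure)."""
    sequence: str            # unmodified amino acid sequence
    sites: Tuple[int, ...]   # 1-based positions carrying the modification
    pep: float               # posterior error probability, in [0, 1]


def delta_pep_score(candidates: List[Candidate]) -> Tuple[Candidate, Optional[float]]:
    """Return the best candidate and an illustrative delta-PEP score.

    ASSUMPTION: the score is 100 * (PEP of the best alternative site
    assignment - PEP of the best candidate). This is a sketch, not the
    published D-score definition.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")

    # A lower PEP means a more confident match, so sort ascending.
    ranked = sorted(candidates, key=lambda c: c.pep)
    best = ranked[0]

    # Only candidates with the same sequence but a different site
    # assignment compete with the best candidate for localization.
    rivals = [c for c in ranked[1:]
              if c.sequence == best.sequence and c.sites != best.sites]
    if not rivals:
        return best, None  # no alternative localization was reported

    return best, 100.0 * (rivals[0].pep - best.pep)
```

Under this assumption, two candidates with nearly identical PEPs yield a score close to zero, reflecting the penalty applied when the search engine cannot distinguish between site assignments. Because only PEPs enter the calculation, the same function could in principle be applied to the output of any search engine for which PEPs have been estimated.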
