Advancement in Protein Inference from Shotgun Proteomics Using Peptide Detectability

A major challenge in shotgun proteomics is assigning identified peptides to the proteins from which they originate, known as the protein inference problem. Redundant and homologous protein sequences are difficult to identify correctly because a single set of peptides may, in many cases, be consistent with multiple proteins. One simple solution is to report the smallest set of proteins that explains all identified peptides. However, it is not clear that a natural system is accurately represented by this minimalist approach. In this paper, we propose a reformulation of the protein inference problem that uses the recently introduced concept of peptide detectability. We also propose a heuristic algorithm to solve the reformulated problem and evaluate its performance on synthetic and real proteomics data. Compared with a greedy implementation of the minimum protein set algorithm, our solution, which incorporates peptide detectability, performs favorably.
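
For reference, the minimum protein set baseline mentioned in the abstract can be sketched as a greedy set cover. This is a minimal illustration under stated assumptions, not the detectability-based algorithm proposed in the paper: the function name, the dictionary layout mapping each protein to its peptides, and the alphabetical tie-breaking rule are all hypothetical choices made for the example.

```python
from typing import Dict, List, Set


def greedy_minimum_protein_set(
    protein_to_peptides: Dict[str, Set[str]],
    identified_peptides: Set[str],
) -> List[str]:
    """Greedy set cover: repeatedly pick the protein that explains the most
    identified peptides that are still unexplained."""
    unexplained = set(identified_peptides)
    chosen: List[str] = []
    while unexplained:
        # Iterate in sorted order so ties are broken alphabetically
        # (an arbitrary, hypothetical rule; max() keeps the first maximum).
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & unexplained),
            default=None,
        )
        if best is None:
            break  # no candidate proteins at all
        gained = protein_to_peptides[best] & unexplained
        if not gained:
            break  # remaining peptides map to no candidate protein
        chosen.append(best)
        unexplained -= gained
    return chosen


if __name__ == "__main__":
    # Hypothetical toy database: A and B are homologs sharing all peptides.
    proteins = {
        "A": {"p1", "p2", "p3"},
        "B": {"p1", "p2", "p3"},
        "C": {"p4"},
    }
    identified = {"p1", "p2", "p4"}
    print(greedy_minimum_protein_set(proteins, identified))
```

In this toy input, proteins A and B explain exactly the same identified peptides, so the greedy cover can only choose between them by the arbitrary tie-breaking rule. That ambiguity is the kind of case in which additional evidence, such as peptide detectability, might help discriminate between candidates.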
