Development and assessment of scoring functions for protein identification using PMF data

Peptide mass fingerprinting (PMF) is one of the major methods for protein identification by mass spectrometry (MS). It is faster and cheaper than tandem MS (MS/MS). Because PMF cannot differentiate trypsin-digested peptides of identical mass, it is less informative than MS/MS; nevertheless, computational methods for PMF have the potential to improve its detection accuracy by making better use of the information content in PMF spectra. We developed several new probability-based scoring functions for PMF protein identification, building on the MOWSE algorithm. These functions take into account a detailed distribution of matching masses in a protein database, peak intensity, and the likelihood that matched peptides lie close to each other in a protein sequence. We assessed our methods and compared them with other methods using PMF data from 52 gel spots of known protein standards. The comparison shows that our new scoring schemes achieve accuracies higher than or comparable to those of existing methods. Our software is freely available upon request, and the scoring functions can be easily incorporated into other proteomics software packages.
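To make the MOWSE starting point concrete, the following is a minimal sketch of a MOWSE-style score, not the scoring functions developed in this work. It assumes a simplified form of the published idea: database peptide masses are binned, each bin's frequency is normalized to the most populated bin, and a candidate protein scores 50000 divided by the product of the matched-bin frequencies and the protein mass. The function names, the 100 Da bin width, the 0.5 Da tolerance, and the use of a single frequency table (rather than one per protein mass range) are illustrative assumptions.

```python
from collections import Counter

BIN_WIDTH = 100.0  # Da; assumed peptide-mass bin width


def build_frequency_table(database_peptide_masses):
    """Return per-bin peptide frequencies, normalized to the largest bin.

    Simplification: one table for the whole database instead of one
    table per protein-mass interval.
    """
    counts = Counter(int(m // BIN_WIDTH) for m in database_peptide_masses)
    max_count = max(counts.values())
    return {b: c / max_count for b, c in counts.items()}


def mowse_like_score(observed_peaks, protein_peptide_masses,
                     protein_mass, freq, tol=0.5):
    """Score one candidate protein against a list of observed peak masses.

    Each observed peak that lies within `tol` Da of a theoretical peptide
    mass contributes that peptide's bin frequency to a running product.
    Rare masses (low frequency) therefore raise the score more than
    common ones. Returns 0.0 when nothing matches.
    """
    product = 1.0
    matches = 0
    for peak in observed_peaks:
        hit = next((m for m in protein_peptide_masses
                    if abs(m - peak) <= tol), None)
        if hit is not None:
            # Bins absent from the table fall back to 1.0 (hypothetical choice).
            product *= freq.get(int(hit // BIN_WIDTH), 1.0)
            matches += 1
    if matches == 0:
        return 0.0
    return 50000.0 / (product * protein_mass)


# Toy data, invented purely for illustration.
db_masses = [1000.5, 1050.2, 1099.0, 2000.1]
freq_table = build_frequency_table(db_masses)
peaks = [1000.6, 2000.0, 1500.0]
score_a = mowse_like_score(peaks, [1000.5, 2000.1], 30000.0, freq_table)
score_b = mowse_like_score(peaks, [1050.2], 30000.0, freq_table)
```

In this toy setup the candidate with two matched peptides outscores the one with none. Refinements such as peak intensity or the proximity of matched peptides in the sequence would enter as additional factors and are not modeled here.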
