RAId_deNovo: giving the score distribution of all possible peptides for statistical inference in peptide identifications

Summary: A major challenge in mass-spectrometry-based proteomics is assigning statistical significance to peptide identifications. To address this problem, RAId_deNovo generates, for a given tandem mass spectrum, the score distribution obtained by scoring all possible peptides under a certain class of scoring functions. This information may aid the development of better measures for assigning statistical significance to peptide candidates. Using a novel algorithm, RAId_deNovo tracks the score distribution together with the peptide lengths associated with each score, which allows proper score normalization.

Availability: The webserver is at http://www.ncbi.nlm.nih.gov/CBBResearch/qmbp/raid_denovo/index.html. Binaries for Linux, Windows, and Mac OS X are available from the same page.

Contact: yyu@ncbi.nlm.nih.gov
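
To make the idea of a score distribution over all possible peptides concrete, the following is a minimal sketch, not RAId_deNovo's actual algorithm. It assumes an additive scoring function (a peptide's score is the sum of the peak scores at its prefix masses), integer nominal residue masses, and a toy alphabet of five residues. The names `RESIDUE_MASS`, `score_length_distribution`, and `peak_score` are hypothetical and introduced only for illustration. Under these assumptions, dynamic programming over mass counts every peptide of a given total mass, grouped by (score, length), without enumerating the peptides.

```python
from collections import Counter

# Assumption: a toy subset of residues with nominal (integer) masses.
RESIDUE_MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99}


def score_length_distribution(target_mass, peak_score):
    """Count all peptides of total mass `target_mass` by (score, length).

    `peak_score` maps an integer prefix mass to the score gained when a
    peptide has a prefix of exactly that mass (assumed additive scoring).
    """
    # table[m] maps (score, length) -> number of residue strings of mass m.
    table = [Counter() for _ in range(target_mass + 1)]
    table[0][(0, 0)] = 1  # the empty prefix: score 0, length 0
    for m in range(1, target_mass + 1):
        gain = peak_score.get(m, 0)
        for r in RESIDUE_MASS.values():
            if r > m:
                continue
            # Extend every prefix of mass m - r by one residue of mass r.
            for (score, length), count in table[m - r].items():
                table[m][(score + gain, length + 1)] += count
    return table[target_mass]


# Two peptides have mass 128 in this toy alphabet: GA and AG.
dist = score_length_distribution(128, {57: 2, 71: 1})
```

Here `dist` holds one peptide with score 2 and length 2 (GA, which has a prefix at mass 57) and one with score 1 and length 2 (AG, which has a prefix at mass 71). Because the length is kept next to each score, a length-normalized score such as `score / length` can be computed afterwards; this particular normalization is an illustrative choice, not one stated in the summary.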
