PEAKS DB: De Novo Sequencing Assisted Database Search for Sensitive and Accurate Peptide Identification

Many software tools have been developed for the automated identification of peptides from tandem mass spectra. The accuracy and sensitivity of database search software are critical for successful proteomics experiments. A new database search tool, PEAKS DB, has been developed that incorporates de novo sequencing results into the database search. PEAKS DB achieves significantly better accuracy and sensitivity than two other commonly used software packages. In addition, a new result validation method, decoy fusion, is introduced to address the overconfidence that the conventional target-decoy method shows for certain types of peptide identification software.
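
The abstract does not describe the mechanics of either validation method, so the following is only a rough sketch under stated assumptions. It assumes that decoy fusion means appending a decoy sequence to each target protein so that both are searched as a single entry, that the decoy is the reversed target sequence, and that the false discovery rate (FDR) is estimated as the ratio of decoy matches to target matches at or above a score threshold. The names `Psm`, `fuse_decoy`, and `estimate_fdr` are hypothetical and are not part of PEAKS DB.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Psm:
    """A peptide-spectrum match (hypothetical, minimal representation)."""
    score: float
    is_decoy: bool


def fuse_decoy(target_seq: str) -> str:
    """Append a decoy to its target so both live in one database entry.

    Assumption: the decoy is simply the reversed target sequence.
    """
    return target_seq + target_seq[::-1]


def estimate_fdr(psms: Iterable[Psm], threshold: float) -> float:
    """Estimate FDR as (decoy matches) / (target matches) at or above threshold.

    Assumption: this ratio is the estimator; returns 0.0 when no target passes.
    """
    targets = 0
    decoys = 0
    for psm in psms:
        if psm.score >= threshold:
            if psm.is_decoy:
                decoys += 1
            else:
                targets += 1
    if targets == 0:
        return 0.0
    return decoys / targets


# Toy data only; these scores are made up for illustration.
example_psms = [
    Psm(10.0, False),
    Psm(9.0, False),
    Psm(8.0, True),
    Psm(7.0, False),
    Psm(6.0, False),
    Psm(5.0, True),
]
```

Because each fused entry carries its own decoy, any protein that survives a first filtering round brings its decoy along into later rounds, which is one plausible reading of how the method avoids the overconfidence mentioned above.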
