Neural Network-Based Method for Peptide Identification in Proteomics

Identifying the proteins present in biological samples is one of the main objectives of proteomics. In proteomic experiments, proteins are first digested into short peptides, which are then analyzed by tandem mass spectrometry and identified by database search algorithms. This study proposes a novel neural network-based method for peptide identification. The method improves identification efficiency by incorporating additional peptide-specific features and scores from multiple database search algorithms. In addition, a method is presented for filtering out low-quality mass spectra prior to the database search, in order to reduce the overall computational time of the identification process.
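As a rough illustration of the general idea, and not the authors' implementation, the sketch below trains a small feedforward network that combines scores from two search engines with simple peptide-level features to estimate whether a peptide-spectrum match is correct. Everything in it is an assumption made for the example: the data are synthetic, and the feature set (two engine scores, peptide length, precursor charge), the single hidden layer, and all function names are hypothetical.

```python
import numpy as np


def make_synthetic_psms(n, rng):
    """Generate made-up peptide-spectrum matches (not real data).

    Columns: score from engine A, score from engine B, peptide length, charge.
    Labels: 1.0 = correct match, 0.0 = incorrect match.
    """
    labels = rng.integers(0, 2, size=n)
    score_a = rng.normal(loc=labels * 2.0, scale=1.0)  # correct matches score higher
    score_b = rng.normal(loc=labels * 1.5, scale=1.0)
    length = rng.integers(6, 30, size=n).astype(float)
    charge = rng.integers(1, 4, size=n).astype(float)
    features = np.column_stack([score_a, score_b, length, charge])
    return features, labels.astype(float)


def standardize(features):
    """Scale every feature column to zero mean and unit variance."""
    return (features - features.mean(axis=0)) / features.std(axis=0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_mlp(X, y, hidden=8, epochs=1000, lr=0.5, seed=0):
    """Fit a one-hidden-layer network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # hidden activations
        p = sigmoid(H @ W2 + b2)           # predicted probability of a correct match
        g = (p - y) / len(y)               # gradient of mean cross-entropy w.r.t. logits
        gH = np.outer(g, W2) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    return W1, b1, W2, b2


def predict_proba(params, X):
    """Return the network's probability that each match is correct."""
    W1, b1, W2, b2 = params
    return sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X, y = make_synthetic_psms(1000, rng)
    X = standardize(X)
    params = train_mlp(X, y)
    accuracy = np.mean((predict_proba(params, X) > 0.5) == (y == 1.0))
    print(f"training accuracy on synthetic data: {accuracy:.3f}")
```

In a real setting the labels would have to come from an external source of ground truth, and accuracy should be measured on held-out matches, not on the training set as is done here for brevity.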
