PRIME: A Mass Spectrum Data Mining Tool for De Nova Sequencing and PTMs Identification

Abstract: De novo sequencing is one of the most promising proteomics techniques for identifying protein post-translational modifications (PTMs) in studies of protein regulation and function. We have developed a computer tool, PRIME, for identifying b and y ions in tandem mass spectra, a key challenge in de novo sequencing. PRIME exploits the fact that mass differences between ions of the same type and mass differences between ions of different types follow different distributions, which allows it to separate b ions from y ions. We have formulated this separation as a graph partition problem and implemented an integer linear programming algorithm to solve it rigorously and efficiently. The performance of PRIME has been demonstrated on a large number of simulated tandem mass spectra derived from the yeast genome, and its power to detect PTMs has been tested on 216 simulated phosphopeptides.
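
The graph partition idea can be illustrated with a minimal sketch. It is not PRIME's implementation: the edge weights below are a simplified stand-in for the mass-difference distributions (assumption: +1 when two peaks differ by one residue mass, suggesting the same ion type, and -1 when two peaks sum to the complementary b/y pair mass, suggesting different types), and exhaustive enumeration replaces the integer linear program, so it is only practical for a handful of peaks. All function names, the tolerance, and the four-residue mass table are hypothetical choices made for the example.

```python
from itertools import product

# Monoisotopic masses in Da (standard values, rounded); only four residues
# are listed to keep the example small.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841}
WATER = 18.01056
PROTON = 1.00728


def simulate_ions(peptide):
    """Return singly charged b ions, y ions, and the mass of a b/y pair."""
    prefix, total = [], 0.0
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        prefix.append(total)
    b_ions = [m + PROTON for m in prefix[:-1]]
    # y_ions[k] is the complement of b_ions[k].
    y_ions = [total - m + WATER + PROTON for m in prefix[:-1]]
    pair_sum = total + WATER + 2 * PROTON
    return b_ions, y_ions, pair_sum


def edge_weight(m1, m2, pair_sum, tol=0.02):
    """Illustrative weight: -1 for complementary peaks, +1 for peaks one
    residue apart, 0 when there is no evidence either way."""
    if abs(m1 + m2 - pair_sum) <= tol:
        return -1
    if any(abs(abs(m1 - m2) - r) <= tol for r in RESIDUE_MASS.values()):
        return 1
    return 0


def partition_peaks(peaks, pair_sum, tol=0.02):
    """Split peaks into two groups by brute force, maximizing agreement with
    the edge weights. The first peak is fixed in group 0 to break symmetry."""
    n = len(peaks)
    if n == 0:
        return []
    weights = {
        (i, j): edge_weight(peaks[i], peaks[j], pair_sum, tol)
        for i in range(n)
        for j in range(i + 1, n)
    }
    best_labels, best_score = None, None
    for rest in product((0, 1), repeat=n - 1):
        labels = (0,) + rest
        # A positive edge is satisfied when both ends share a group,
        # a negative edge when they are split.
        score = sum(
            w if labels[i] == labels[j] else -w for (i, j), w in weights.items()
        )
        if best_score is None or score > best_score:
            best_labels, best_score = labels, score
    return list(best_labels)


b_ions, y_ions, pair_sum = simulate_ions("GASV")
labels = partition_peaks(b_ions + y_ions, pair_sum)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

For the simulated peptide GASV, the three b ions land in one group and the three y ions in the other. The partition alone does not say which group holds the b ions; deciding that requires additional information that this sketch does not model.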
