Protein identification by spectral networks analysis

Advances in tandem mass spectrometry (MS/MS) steadily increase the rate of generation of MS/MS spectra. As a result, the existing approaches that compare spectra against databases are already facing a bottleneck, particularly when interpreting spectra of modified peptides. Here we explore a concept that allows one to perform an MS/MS database search without ever comparing a spectrum against a database. We propose to take advantage of spectral pairs, which are pairs of spectra obtained from overlapping (often nontryptic) peptides or from unmodified and modified versions of the same peptide. Having a spectrum of a modified peptide paired with a spectrum of an unmodified peptide allows one to separate the prefix and suffix ladders, to greatly reduce the number of noise peaks, and to generate a small number of peptide reconstructions that are likely to contain the correct one. The MS/MS database search is thus reduced to extremely fast pattern-matching (rather than time-consuming matching of spectra against databases). In addition to speed, our approach provides a unique paradigm for identifying posttranslational modifications by means of spectral networks analysis.
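The way a spectral pair separates the prefix and suffix ladders can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes idealized, singly charged b- and y-ion peaks, two hypothetical overlapping peptides (`SPVTL` and its N-terminal extension `GSPVTL`), a few made-up noise peaks, and an arbitrary tolerance `tol`. The helper names `b_ions`, `y_ions`, and `split_ladders` are introduced here for illustration only.

```python
# Monoisotopic residue masses (Da) for the few amino acids used in this toy example.
RESIDUE = {"G": 57.02146, "S": 87.03203, "P": 97.05276,
           "V": 99.06841, "T": 101.04768, "L": 113.08406}
PROTON, WATER = 1.00728, 18.01056


def b_ions(peptide):
    """Singly charged prefix (b) ion masses b_1 .. b_(n-1)."""
    total, masses = PROTON, []
    for aa in peptide[:-1]:
        total += RESIDUE[aa]
        masses.append(total)
    return masses


def y_ions(peptide):
    """Singly charged suffix (y) ion masses y_1 .. y_(n-1)."""
    total, masses = PROTON + WATER, []
    for aa in reversed(peptide[1:]):
        total += RESIDUE[aa]
        masses.append(total)
    return masses


def split_ladders(spec_a, spec_b, delta, tol=0.01):
    """Split peaks of spec_a by how they pair with peaks of spec_b.

    Returns (unshifted, shifted): peaks of spec_a that reappear in spec_b
    at the same mass, and peaks that reappear offset by delta.
    Peaks that pair with nothing (mostly noise) are dropped.
    """
    def has_peak(spec, mass):
        return any(abs(mass - x) <= tol for x in spec)

    unshifted = [m for m in spec_a if has_peak(spec_b, m)]
    shifted = [m for m in spec_a if has_peak(spec_b, m + delta)]
    return unshifted, shifted


# Hypothetical overlapping peptides: `longer` extends `short` by one residue (G).
short, longer = "SPVTL", "GSPVTL"
delta = RESIDUE["G"]  # parent-mass difference between the two peptides

# Idealized spectra with a few invented noise peaks added.
spec_short = sorted(b_ions(short) + y_ions(short) + [150.5, 300.25])
spec_long = sorted(b_ions(longer) + y_ions(longer) + [200.7, 410.3])

suffix_ladder, prefix_ladder = split_ladders(spec_short, spec_long, delta)
print("suffix (y) ladder:", suffix_ladder)
print("prefix (b) ladder:", prefix_ladder)
```

In this toy case the unshifted matches are exactly the y ions of the shorter peptide, the matches offset by `delta` are exactly its b ions, and the unpaired noise peaks disappear. Real spectra would need intensity handling, charge states, and a tolerance chosen for the instrument.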
