Protein identification by spectral networks analysis.

While advances in tandem mass spectrometry (MS/MS) steadily increase the rate at which MS/MS spectra are generated, standard algorithmic approaches to peptide identification have recently seemed to be reaching the limit of the information that can be extracted from those spectra. A closer look reveals a common limiting practice: each spectrum is analyzed in isolation, even though high-throughput mass spectrometry regularly generates many spectra from related peptides. By capitalizing on this redundancy, we show that, much as protein sequences can be aligned, unidentified MS/MS spectra can be aligned to identify modified and unmodified variants of the same peptide. Moreover, this alignment procedure can be iterated to accurately group multiple modification variants of the same peptide. Furthermore, combining shotgun proteomics with the alignment of spectra from overlapping peptides led to the development of Shotgun Protein Sequencing: similarly to the assembly of DNA reads into whole genomic sequences, we show that assembly of MS/MS spectra enables the highest de novo sequencing accuracy to date while recovering nearly complete protein sequences. We further show that shotgun protein sequencing has the potential to overcome the limitations of current protein sequencing approaches and thus catalyze otherwise impractical applications of proteomics methodologies in studies of unknown proteins.
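
To make the idea of aligning a modified and an unmodified spectrum concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each spectrum has already been reduced to a list of prefix (b-ion-like) masses and that the two peptides differ by a single modification. Under those assumptions, peaks before the modification site match at offset 0 and peaks after it match at an offset equal to the parent mass difference. The function names `shared_peaks` and `align_pair`, the tolerance of 0.5 Da, and the toy masses are all hypothetical.

```python
from typing import List, Tuple


def shared_peaks(a: List[float], b: List[float],
                 shift: float = 0.0, tol: float = 0.5) -> int:
    """Count peaks m in `a` for which `b` has a peak near m + shift.

    Each peak in `b` is used at most once (greedy two-pointer match).
    """
    b_sorted = sorted(b)
    count = 0
    j = 0
    for m in sorted(a):
        target = m + shift
        # Skip peaks in b that are too light to match this target.
        while j < len(b_sorted) and b_sorted[j] < target - tol:
            j += 1
        if j < len(b_sorted) and abs(b_sorted[j] - target) <= tol:
            count += 1
            j += 1
    return count


def align_pair(a: List[float], parent_a: float,
               b: List[float], parent_b: float,
               tol: float = 0.5) -> Tuple[float, int, int]:
    """Return (mass difference, matches at offset 0, matches at that offset).

    Simplification: the two counts are computed independently. A full
    spectral alignment would also check that the unshifted matches come
    before the shifted ones along the peptide.
    """
    delta = parent_b - parent_a
    return delta, shared_peaks(a, b, 0.0, tol), shared_peaks(a, b, delta, tol)


if __name__ == "__main__":
    # Toy data (hypothetical): the second spectrum carries a +16 Da
    # modification that affects its last two prefix masses.
    unmodified = [100.0, 200.0, 300.0, 400.0]
    modified = [100.0, 200.0, 316.0, 416.0]
    print(align_pair(unmodified, 500.0, modified, 516.0))
```

In this toy case every peak of the unmodified spectrum is explained by one of the two offsets, which is the kind of signal that suggests two spectra come from variants of the same peptide even when neither has been identified.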
