Protein identification from tandem mass spectra by database searching.

Protein identification from tandem mass spectra is one of the most versatile and widely used proteomics workflows. It can identify proteins, characterize post-translational modifications, and provide semiquantitative measurements of relative protein abundance. This manuscript describes the concepts, prerequisites, and methods required to identify the proteins in a tandem mass spectrometry dataset by searching protein sequence databases with a tandem mass spectrometry search engine. The discussion covers the extraction, preparation, and formatting of spectral data files; the selection of appropriate search parameter settings; and basic interpretation of the results.
