Preview: a program for surveying shotgun proteomics tandem mass spectrometry data.

Database search programs for peptide identification by tandem mass spectrometry ask their users to set various parameters, including precursor and fragment mass tolerances, digestion specificity, and allowed types of modifications. Even proteomics experts with detailed knowledge of their samples may find it difficult to make these choices without significant investigation, and poor choices can lead to missed identifications and misleading results. Here we describe a program called Preview that analyzes a set of mass spectra for mass errors, digestion specificity, and known and unknown modifications, thereby facilitating parameter selection. Moreover, Preview optionally recalibrates mass-to-charge (m/z) measurements, which further improves identification results. In a study of Bruton's tyrosine kinase, we found that using Preview increased the numbers of confidently identified mass spectra and phosphorylation sites by about 50%.
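To make the notions of mass error and recalibration concrete, the following is a minimal sketch in Python. It is not Preview's algorithm, which the abstract does not specify. It assumes a hypothetical list of (observed, theoretical) m/z pairs from confident identifications, estimates a single systematic offset as the median error in parts per million (ppm), and removes that offset from a measurement. All function names are illustrative.

```python
from statistics import median


def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def estimate_systematic_ppm(pairs):
    """Median ppm error over (observed, theoretical) m/z pairs.

    Assumption: the pairs come from confident identifications, so the
    median reflects instrument drift rather than wrong matches.
    """
    return median(ppm_error(obs, theo) for obs, theo in pairs)


def recalibrate(observed_mz, systematic_ppm):
    """Remove a constant ppm offset from an observed m/z value."""
    return observed_mz / (1.0 + systematic_ppm * 1e-6)


# Hypothetical data: every measurement reads 5 ppm too high.
pairs = [
    (500.0 * (1 + 5e-6), 500.0),
    (750.0 * (1 + 5e-6), 750.0),
    (1000.0 * (1 + 5e-6), 1000.0),
]
offset_ppm = estimate_systematic_ppm(pairs)
corrected = recalibrate(pairs[0][0], offset_ppm)
```

The spread of the ppm errors that remains after such a correction suggests how tight a precursor mass tolerance a subsequent database search could use. A real recalibration might also model dependence on m/z or retention time, which this sketch ignores.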
