A review of methods for interpretation of glycopeptide tandem mass spectral data

Despite the publication of several software tools for the analysis of glycopeptide tandem mass spectra, there is still no consensus on the most effective and appropriate methods. In part, this reflects problems with applying standard methods for proteomics database searching and false discovery rate calculation. While the analysis of small post-translational modifications (PTMs) may be regarded as an extension of proteomics database searching, glycosylation requires specialized approaches. Glycans are large and heterogeneous, so glycopeptides exist as multiple glycosylated variants, and the mass of the peptide cannot be calculated directly from that of the intact glycopeptide. In addition, the chemical nature of the glycan strongly influences the product ion patterns observed for glycopeptides. Glycopeptidomics therefore requires dedicated bioinformatics methods. We summarize recent progress towards a consensus for effective glycopeptide tandem mass spectrometric analysis.
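The mass ambiguity described above can be illustrated with a minimal sketch. It is not taken from any of the reviewed tools. The function name `candidate_peptide_masses`, the composition bounds, and the example masses are hypothetical; the monosaccharide residue masses are standard monoisotopic values. The sketch assumes that the neutral glycopeptide mass equals the peptide mass plus the sum of the glycan residue masses.

```python
from itertools import product

# Monoisotopic residue masses (Da) of common monosaccharides.
RESIDUE_MASS = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}


def candidate_peptide_masses(glycopeptide_mass, max_counts):
    """Enumerate glycan compositions and the peptide mass each one implies.

    glycopeptide_mass: neutral monoisotopic mass of the intact glycopeptide.
    max_counts: hypothetical upper bound per monosaccharide, e.g. {"Hex": 9}.
    Returns a list of (composition, peptide_mass) pairs, keeping only
    compositions that leave a positive peptide mass.
    """
    names = list(max_counts)
    results = []
    for counts in product(*(range(max_counts[n] + 1) for n in names)):
        glycan_mass = sum(c * RESIDUE_MASS[n] for n, c in zip(names, counts))
        peptide_mass = glycopeptide_mass - glycan_mass
        if peptide_mass > 0:
            results.append((dict(zip(names, counts)), peptide_mass))
    return results


# Hypothetical example: a 1000.5 Da peptide carrying HexNAc2Hex5.
observed = 1000.5 + 2 * RESIDUE_MASS["HexNAc"] + 5 * RESIDUE_MASS["Hex"]
bounds = {"Hex": 9, "HexNAc": 6, "Fuc": 2, "NeuAc": 4}
candidates = candidate_peptide_masses(observed, bounds)
print(len(candidates), "glycan compositions are consistent with this precursor")
```

Every composition in the list implies a different peptide mass, so the precursor mass alone does not identify the peptide. Additional evidence, such as peptide backbone fragments or peptide plus HexNAc product ions, would be needed to narrow the candidates.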
