Automatic annotation of matrix‐assisted laser desorption/ionization N‐glycan spectra

Matrix‐assisted laser desorption/ionization‐mass spectrometry (MALDI‐MS) is the pre‐eminent technique for mass mapping of glycans. To make this technique practical for high‐throughput screening, reliable automatic methods of annotating peaks must be devised. We describe an algorithm called Cartoonist that labels peaks in MALDI spectra of permethylated N‐glycans with cartoons that represent the most plausible glycans consistent with the peak masses and the types of glycans being analyzed. Cartoonist has three main parts. (i) It selects annotations from a library of biosynthetically plausible cartoons. The library we currently use has about 2800 cartoons but was constructed from only about 300 archetype cartoons entered by hand. (ii) It determines the precision and calibration of the machine used to generate the spectrum, automatically, from the spectrum itself. (iii) It assigns a confidence score to each annotation. In particular, rather than making a binary yes/no decision when annotating a peak, it makes all plausible annotations and associates each with a score indicating the probability that it is correct.
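The three parts can be illustrated with a minimal sketch. It is not the Cartoonist implementation, which the abstract does not specify. Everything in it is an assumption made for illustration: the five-entry toy library and its approximate masses, the function names, the median-based estimate of calibration offset and precision, and the Gaussian score, which is a confidence-like value in (0, 1] and not a calibrated probability.

```python
import math
from statistics import median

# Toy library of (label, approximate m/z). Hypothetical stand-in for the
# cartoon library; the masses are illustrative, not reference values.
LIBRARY = [
    ("Man5GlcNAc2", 1579.8),
    ("Man6GlcNAc2", 1783.9),
    ("Man7GlcNAc2", 1988.0),
    ("Man8GlcNAc2", 2192.1),
    ("Man9GlcNAc2", 2396.2),
]


def nearest(mass, library):
    """Return the library entry whose mass is closest to `mass`."""
    return min(library, key=lambda entry: abs(entry[1] - mass))


def estimate_calibration(peaks, library, window=1.0):
    """Estimate a constant mass offset and a spread from the spectrum itself.

    Assumption: most peaks within `window` of a library mass are true
    matches, so the median residual approximates the calibration error and
    the median absolute deviation approximates the precision.
    """
    residuals = [p - nearest(p, library)[1] for p in peaks]
    residuals = [r for r in residuals if abs(r) <= window]
    if not residuals:
        return 0.0, window
    offset = median(residuals)
    mad = median([abs(r - offset) for r in residuals])
    sigma = max(1.4826 * mad, 0.05)  # floor avoids a zero-width tolerance
    return offset, sigma


def annotate(peaks, library, window=1.0):
    """Attach every plausible annotation to each peak, with a score."""
    offset, sigma = estimate_calibration(peaks, library, window)
    annotations = {}
    for peak in peaks:
        corrected = peak - offset
        scored = []
        for label, mass in library:
            error = corrected - mass
            if abs(error) <= window:
                score = math.exp(-0.5 * (error / sigma) ** 2)
                scored.append((label, round(score, 3)))
        # Highest-scoring annotation first; unmatched peaks get an empty list.
        annotations[peak] = sorted(scored, key=lambda item: -item[1])
    return annotations


if __name__ == "__main__":
    spectrum = [1580.1, 1784.2, 1988.3, 2500.0]
    for peak, candidates in annotate(spectrum, LIBRARY).items():
        print(peak, candidates)
```

In this sketch a peak is never forced into a yes/no decision: each peak keeps a ranked list of candidates, and a peak with no library entry inside the window keeps an empty list.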
