Defining and Detecting Complex Peak Relationships in Mass Spectral Data: The Mz.unity Algorithm

Analysis of a single analyte by mass spectrometry can result in the detection of more than 100 degenerate peaks. These degenerate peaks complicate spectral interpretation and are challenging to annotate. In mass spectrometry-based metabolomics, this degeneracy leads to inflated false discovery rates, data sets containing an order of magnitude more features than analytes, and an inefficient use of resources during data analysis. Although software has been introduced to annotate spectral degeneracy, current approaches are unable to represent several important classes of peak relationships. These include heterodimers and higher-order complex adducts, distal fragments, relationships between peaks in different polarities, and complex adducts between features and background peaks. Here we outline sources of peak degeneracy in mass spectra that are not annotated by current approaches and introduce a software package called mz.unity to detect these relationships in accurate mass data. Using mz.unity, we find that data sets contain many more complex relationships than we anticipated. Examples include the adduct of glutamate and nicotinamide adenine dinucleotide (NAD), fragments of NAD detected in the same or opposite polarities, and the adduct of glutamate and a background peak. Further, the complex relationships we identify show that several assumptions commonly made when interpreting mass spectral degeneracy do not hold in general. These contributions provide new tools and insight to aid in the annotation of complex spectral relationships and provide a foundation for improved data set identification. Mz.unity is an R package and is freely available at https://github.com/nathaniel-mahieu/mz.unity and on our laboratory Web site, http://pattilab.wustl.edu/software/.
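
To make the idea of a heterodimer relationship concrete, the mass arithmetic behind it can be sketched in a few lines of Python. This is not the mz.unity implementation or its API: the function names, the 5 ppm tolerance, and the restriction to singly protonated ions are assumptions made for illustration, and the peak list is synthetic, built from approximate monoisotopic masses of glutamate and NAD rather than taken from measured data.

```python
from itertools import combinations

PROTON = 1.007276  # approximate mass of a proton in Da


def neutral_mass(mz):
    """Neutral mass of a singly protonated ion [M+H]+ (assumed ion type)."""
    return mz - PROTON


def ppm_error(observed, expected):
    """Signed mass error in parts per million."""
    return (observed - expected) / expected * 1e6


def find_heterodimers(mzs, tol_ppm=5.0):
    """Return index triples (i, j, k) where peak k matches [A+B+H]+
    formed from the neutral masses of peaks i and j (i < j).

    Hypothetical helper: assumes every peak is a singly charged [M+H]+ ion.
    """
    hits = []
    for i, j in combinations(range(len(mzs)), 2):
        expected = neutral_mass(mzs[i]) + neutral_mass(mzs[j]) + PROTON
        for k, mz in enumerate(mzs):
            if k in (i, j):
                continue
            if abs(ppm_error(mz, expected)) <= tol_ppm:
                hits.append((i, j, k))
    return hits


# Synthetic peak list (illustrative values, not measured data):
# index 0 ~ glutamate [M+H]+, index 1 ~ NAD [M+H]+,
# index 2 ~ their heterodimer [A+B+H]+, index 3 ~ an unrelated peak.
peaks = [148.0604, 664.1164, 811.1696, 300.0000]
hits = find_heterodimers(peaks)
```

Here `hits` holds the single triple linking peaks 0 and 1 to peak 2. A search of this kind only covers one ion type in one polarity; the relationship classes listed in the abstract (distal fragments, cross-polarity pairs, adducts with background peaks) would each need their own mass-difference rules.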
