iMet: A Network-Based Computational Tool To Assist in the Annotation of Metabolites from Tandem Mass Spectra.

Structural annotation of metabolites relies mainly on tandem mass spectrometry (MS/MS) analysis. However, approximately 90% of the known metabolites reported in metabolomic databases lack annotated spectral data from standards. This situation has fostered the development of computational tools that predict fragmentation patterns in silico and compare them to experimental MS/MS spectra. Because such methods require the molecular structure of the detected compound to be available to the algorithm, the identification of novel metabolites in organisms relevant for biotechnological and medical applications remains a challenge. Here, we present iMet, a computational tool that facilitates structural annotation of metabolites not described in databases. iMet uses MS/MS spectra and the exact mass of an unknown metabolite to identify metabolites in a reference database that are structurally similar to the unknown metabolite. The algorithm also suggests the chemical transformation that converts each known metabolite into the unknown one. As a proxy for the structural annotation of novel metabolites, we tested 148 metabolites, either following a leave-one-out cross-validation procedure or using MS/MS spectra obtained experimentally in our laboratory. We show that for 89% of the 148 metabolites, at least one of the top four matches identified by iMet enables proper annotation of the unknown metabolite. To further validate iMet, we tested 31 metabolites proposed in the 2012 to 2016 CASMI challenges. iMet is freely available at http://imet.seeslab.net.
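The general idea of pairing spectral similarity with a mass-difference lookup can be illustrated with a minimal sketch. This is not iMet's algorithm, which the abstract does not specify in detail; cosine similarity and a fixed transformation table are stand-in techniques chosen for brevity. All names here (`TRANSFORMATIONS`, `cosine_similarity`, `suggest_transformation`, `rank_candidates`), the tolerances, and the layout of the reference data are assumptions made for illustration.

```python
import math

# Assumed table: monoisotopic mass shifts (Da) of a few common transformations.
TRANSFORMATIONS = {
    "methylation (+CH2)": 14.01565,
    "hydroxylation (+O)": 15.99491,
    "acetylation (+C2H2O)": 42.01057,
    "dehydrogenation (-H2)": -2.01565,
}


def cosine_similarity(spec_a, spec_b, tol=0.01):
    """Cosine score between two spectra given as lists of (m/z, intensity).

    Each peak in spec_a is matched to the closest unused peak in spec_b
    whose m/z differs by at most tol (a simple greedy matching).
    """
    used = set()
    dot = 0.0
    for mz_a, int_a in spec_a:
        best = None
        for j, (mz_b, _) in enumerate(spec_b):
            if j in used or abs(mz_a - mz_b) > tol:
                continue
            if best is None or abs(mz_a - mz_b) < abs(mz_a - spec_b[best][0]):
                best = j
        if best is not None:
            used.add(best)
            dot += int_a * spec_b[best][1]
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_transformation(unknown_mass, known_mass, tol=0.005):
    """Return the name of a tabulated transformation whose mass shift
    explains the difference between the two exact masses, or None."""
    delta = unknown_mass - known_mass
    for name, shift in TRANSFORMATIONS.items():
        if abs(delta - shift) <= tol:
            return name
    return None


def rank_candidates(unknown_mass, unknown_spec, reference, top_n=4):
    """Rank reference metabolites by spectral similarity to the unknown.

    reference maps a metabolite name to (exact_mass, spectrum).
    Returns up to top_n tuples of (score, name, suggested transformation).
    """
    scored = [
        (
            cosine_similarity(unknown_spec, spec),
            name,
            suggest_transformation(unknown_mass, mass),
        )
        for name, (mass, spec) in reference.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]
```

Calling `rank_candidates` with the exact mass and spectrum of an unknown compound and a small reference dictionary returns the most similar known metabolites, each paired with a candidate transformation when the mass difference matches a table entry. The default of four candidates mirrors the "top four matches" evaluated in the abstract; a real tool would need a curated transformation list and a more robust similarity measure.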
