Differentiating signals to make biological sense – A guide through databases for MS‐based non‐targeted metabolomics

Metabolite identification is one of the most challenging steps in metabolomics studies and one of the greatest bottlenecks in the entire workflow. Because the success of this step determines the success of the whole study, the quality of the annotations requires special attention. A variety of tools and resources are available to aid metabolite identification or annotation, offering different and often complementary functionalities. In preparation for this article, almost 50 databases were reviewed, of which 17 were selected for discussion on the basis of their online ESI‐MS functionality. The general characteristics and functions of each database are discussed in turn, covering its advantages and limitations together with recommendations for optimal use, as derived from experience at the Centre for Metabolomics and Bioanalysis (CEMBIO) in Madrid. The databases were evaluated for their utility in non‐targeted metabolomics, including aspects such as identifier assignment, structural assignment and interpretation of results.
