A Computational Solution to Automatically Map Metabolite Libraries in the Context of Genome Scale Metabolic Networks

This article describes a generic programmatic method for mapping chemical compound libraries onto organism-specific metabolic networks drawn from various databases (KEGG, BioCyc) and flat-file formats (SBML and Matlab files). We show how this pipeline was successfully applied to assess the coverage of the chemical libraries set up by two metabolomics facilities, MetaboHub (the French national infrastructure for metabolomics and fluxomics) and Glasgow Polyomics (GP), on the metabolic networks available in the MetExplore web server. The protocol is designed to formalize and reduce the volume of information transferred between the library and the network database. Metabolites are matched between libraries and metabolic networks using InChIs or InChIKeys, so these identifiers must be specified in both the libraries and the networks. In addition to providing coverage statistics, the pipeline allows mapping results to be visualized in the context of metabolic networks. To achieve this, we tackled three issues: programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks, and automatic loading of a mapping into the genome-scale metabolic network analysis tool MetExplore. Importantly, the mapping can also be performed on a single organism or a selection of organisms of interest, and is thus not limited to large facilities.
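The InChIKey-based matching and coverage statistic described above can be illustrated with a minimal Python sketch. It is not the pipeline's actual implementation: the `Metabolite` class, the `map_library` function, and the `first_block_only` option are hypothetical names introduced here, the keys are made-up placeholders rather than real InChIKeys, and "coverage" is assumed to mean the fraction of network metabolites matched by at least one library compound. The optional first-block comparison relies on the fact that the first 14 characters of an InChIKey encode the molecular skeleton, so it ignores stereochemistry and similar layers; whether the pipeline does this is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metabolite:
    name: str
    inchikey: str  # empty string when no structure identifier is annotated


def key_of(inchikey: str, first_block_only: bool = False) -> str:
    """Normalize an InChIKey; optionally keep only its first (skeleton) block."""
    key = inchikey.strip().upper()
    return key.split("-")[0] if first_block_only else key


def map_library(library, network, first_block_only=False):
    """Match network metabolites against library compounds by InChIKey."""
    lib_keys = {
        key_of(m.inchikey, first_block_only) for m in library if m.inchikey.strip()
    }
    # Only network metabolites carrying an InChIKey can be matched at all.
    annotated = [m for m in network if m.inchikey.strip()]
    mapped = [
        m.name for m in annotated
        if key_of(m.inchikey, first_block_only) in lib_keys
    ]
    # Assumed definition: share of all network metabolites that were mapped.
    coverage = len(mapped) / len(network) if network else 0.0
    return {"mapped": mapped, "annotated": len(annotated), "coverage": coverage}


# Placeholder keys (NOT real InChIKeys), in the 14-10-1 character layout.
KEY_A = "AAAAAAAAAAAAAA-AAAAAAAAAA-N"
KEY_B = "BBBBBBBBBBBBBB-BBBBBBBBBB-N"
KEY_C = "CCCCCCCCCCCCCC-CCCCCCCCCC-N"
KEY_C_VARIANT = "CCCCCCCCCCCCCC-DDDDDDDDDD-N"  # same skeleton block as KEY_C

library = [
    Metabolite("compound A", KEY_A),
    Metabolite("compound B", KEY_B),
    Metabolite("compound C", KEY_C),
]
network = [
    Metabolite("met 1", KEY_A),
    Metabolite("met 2", KEY_B.lower()),   # case differences are normalized
    Metabolite("met 3", KEY_C_VARIANT),   # matches only on the first block
    Metabolite("met 4", ""),              # unannotated, can never be mapped
]

exact = map_library(library, network)
loose = map_library(library, network, first_block_only=True)
print(exact)
print(loose)
```

The unannotated metabolite in the toy network shows why identifiers must be present on both sides: it counts against coverage without ever being matchable, which is the motivation for improving metabolite annotation in the networks.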
