Bioinformatics Tools for the Interpretation of Metabolomics Data

Purpose of Review: Metabolomics is a rapidly evolving field that generates large and complex datasets. Bioinformatics is critical for extracting meaningful biological information from these data. In this article, we briefly review computational approaches that are well accepted in the field, and discuss the development of new methods and tools for interpreting metabolomics data.

Recent Findings: Significant progress has been made in computational metabolomics in recent years. This includes methods to preprocess data generated by instruments, to annotate metabolites, to carry out statistical analyses, to identify perturbed metabolic pathways, and to integrate metabolomics with other omics data. Each of these topics is discussed in its own section of this review.

Summary: Bioinformatics tools for metabolomics remain a highly active research area. An ecosystem is emerging that comprises software libraries, standalone tools, and web-based tools and services. While some of these tools require bioinformatics training, many are user friendly and easily accessible. Substantial further development is still needed to serve the metabolomics field and its applications.
