Using MetaboAnalyst 4.0 for Comprehensive and Integrative Metabolomics Data Analysis

MetaboAnalyst (https://www.metaboanalyst.ca) is an easy‐to‐use, web‐based tool suite for comprehensive metabolomic data analysis, interpretation, and integration with other omics data. Since its first release in 2009, MetaboAnalyst has evolved significantly to meet the expanding bioinformatics demands of the rapidly growing metabolomics community. In addition to a variety of data processing and normalization procedures, MetaboAnalyst supports a wide array of functions for statistical analysis, functional analysis, and data visualization. Some of the most widely used approaches include PCA (principal component analysis), PLS‐DA (partial least squares discriminant analysis), clustering analysis and visualization, MSEA (metabolite set enrichment analysis), MetPA (metabolic pathway analysis), biomarker selection via ROC (receiver operating characteristic) curve analysis, and time series and power analysis. The current version of MetaboAnalyst (4.0) features a complete overhaul of the user interface and significantly expanded underlying knowledge bases (compound database, pathway libraries, and metabolite sets). Three new modules have been added to support pathway activity prediction directly from mass peaks, biomarker meta‐analysis, and network‐based multi‐omics data integration. To enable more transparent and reproducible analysis of metabolomic data, we have released a companion R package (MetaboAnalystR) to complement the web‐based application. This article provides an overview of the main functional modules and the general workflow of MetaboAnalyst 4.0, followed by 12 detailed protocols. © 2019 by John Wiley & Sons, Inc.
