Bioinformatics tools for the analysis of NMR metabolomics studies focused on the identification of clinically relevant biomarkers

Metabolomics, a systems biology approach focused on the global study of the metabolome, offers considerable potential for the analysis of clinical samples. Among other applications, it enables the mapping of biochemical alterations involved in the pathogenesis of disease and offers the opportunity to noninvasively identify diagnostic, prognostic and predictive biomarkers that could translate into early therapeutic interventions. In particular, metabolomics by Nuclear Magnetic Resonance (NMR) can simultaneously detect and structurally characterize a large number of metabolic components, even when their identities are unknown. Analyzing the data generated by this experimental approach requires statistical and bioinformatics tools for the correct interpretation of the results. This review covers the steps involved in the metabolomic characterization of biofluids for clinical applications, from study design to the biological interpretation of the results. Particular emphasis is placed on the procedures required for processing and interpreting NMR data, with a focus on the identification of clinically relevant biomarkers.
