Accounting for biological variation with linear mixed-effects modelling improves the quality of clinical metabolomics data

Metabolite profiles from biological samples suffer from both technical variation and subject-specific variation. To improve the quality of metabolomics data, conventional data processing methods can be employed to remove technical variation. However, these methods do not treat sources of subject variation as factors separate from the biological factors of interest. This can be a significant issue when performing quantitative metabolomics in clinical trials or screening for a potential biomarker in early-stage disease, because changes in metabolism or in a desired metabolite signal are small compared to the total metabolite signal. As a result, inter-individual variability can interfere with subsequent statistical analyses. Here, we propose an additional data processing step that uses linear mixed-effects modelling to readjust each individual metabolite signal prior to multivariate analyses. Published clinical metabolomics data were used to demonstrate and evaluate the proposed method. We observed a substantial reduction in the variation of each metabolite signal after model fitting. A comparison with other strategies showed that our proposed method contributed to improved classification accuracy, precision, sensitivity and specificity. Moreover, we highlight the importance of patient metadata, as it contains rich information on subject characteristics that can be used to model and normalize metabolite abundances. The proposed method is available as an R package, lmm2met.
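To make the idea of readjusting a metabolite signal concrete, the following is a minimal sketch in Python, not the lmm2met interface and not necessarily the model the authors fit. It assumes the simplest possible mixed model for a single metabolite, a random intercept per subject with no fixed effects beyond the overall mean, and a balanced design with the same number of repeated measurements per subject. Under those assumptions the variance components can be estimated from the between-subject and within-subject mean squares, and the predicted subject effect is subtracted from each measurement. The function name, the input layout and the example values are all hypothetical.

```python
from statistics import mean


def adjust_random_intercept(signal_by_subject):
    """Subtract predicted subject effects from one metabolite's signal.

    Assumed model (illustrative only): y_ij = mu + b_i + e_ij, where b_i is a
    random intercept for subject i and e_ij is residual noise.
    `signal_by_subject` maps a subject id to a list of repeated measurements;
    every subject must have the same number of measurements (balanced design).
    """
    subjects = list(signal_by_subject)
    k = len(subjects)
    n = len(signal_by_subject[subjects[0]])
    if k < 2 or n < 2:
        raise ValueError("need at least 2 subjects and 2 measurements each")
    if any(len(v) != n for v in signal_by_subject.values()):
        raise ValueError("this sketch assumes a balanced design")

    grand_mean = mean(x for v in signal_by_subject.values() for x in v)
    subject_mean = {s: mean(v) for s, v in signal_by_subject.items()}

    # Between-subject and within-subject mean squares (one-way layout).
    msb = n * sum((m - grand_mean) ** 2 for m in subject_mean.values()) / (k - 1)
    msw = sum(
        (x - subject_mean[s]) ** 2
        for s, v in signal_by_subject.items()
        for x in v
    ) / (k * (n - 1))

    # Moment estimates of the variance components, truncated at zero.
    var_subject = max((msb - msw) / n, 0.0)
    var_residual = msw

    # Shrinkage factor for the predicted random intercepts.
    denom = n * var_subject + var_residual
    shrink = n * var_subject / denom if denom > 0 else 0.0

    # Predicted subject effects, then the adjusted signal with them removed.
    subject_effect = {s: shrink * (subject_mean[s] - grand_mean) for s in subjects}
    adjusted = {
        s: [x - subject_effect[s] for x in v]
        for s, v in signal_by_subject.items()
    }
    return adjusted, subject_effect


# Hypothetical abundances of one metabolite: 3 subjects, 3 measurements each.
raw = {
    "P1": [10.0, 11.0, 10.5],
    "P2": [14.0, 15.0, 14.5],
    "P3": [7.0, 8.0, 7.5],
}
adjusted, subject_effect = adjust_random_intercept(raw)
```

In a real analysis the factor of interest (for example, disease status) and patient metadata such as age or sex would enter the model as additional terms, so that only the unwanted subject-level variation is removed; this sketch treats all between-subject differences as nuisance, which is a simplification.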
