Improving visualisation and interpretation of metabolome-wide association studies (MWAS): an application in a population based cohort using untargeted 1H

1H NMR spectroscopy of biofluids generates reproducible data allowing detection and quantification of small molecules in large population cohorts. Statistical models to analyze such data are now well established, and the use of univariate metabolome-wide association studies (MWAS), which investigate the spectral features separately, has emerged as a computationally efficient and interpretable alternative to multivariate models. MWAS rely on the accurate estimation of a metabolome-wide significance level (MWSL) that is applied to control the family-wise error rate. Subsequent interpretation requires efficient visualization and formal feature annotation, which, in turn, call for efficient prioritization of spectral variables of interest. Using human serum 1H NMR spectroscopic profiles from 3948 participants from the Multi-Ethnic Study of Atherosclerosis (MESA), we have performed a series of MWAS for serum levels of glucose. We first propose an extension of the conventional MWSL that yields stable estimates of the MWSL across the different model parameterizations and distributional features of the outcome. We propose both efficient visualization methods and a strategy based on subsampling and internal validation to prioritize the associations. Our work proposes and illustrates practical and scalable solutions to facilitate the implementation of the MWAS approach and improve interpretation in large cohort studies.
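To make the role of the MWSL concrete, the following is a minimal sketch of one common permutation-based way to estimate it. This is not the extension proposed in the paper. It assumes a simple Pearson correlation test per spectral feature and a Fisher z approximation for the p-values. The function names (`pearson_r`, `p_value`, `estimate_mwsl`), the toy data, and all parameter values are hypothetical and chosen for illustration only.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def p_value(r, n):
    """Two-sided p-value for r using the Fisher z approximation (an assumption)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def estimate_mwsl(features, outcome, alpha=0.05, n_perm=200, seed=0):
    """Estimate the MWSL from the permutation null of the minimum p-value.

    features: list of feature columns, each a list over participants.
    Returns (mwsl, effective_number_of_tests).
    """
    rng = random.Random(seed)
    n = len(outcome)
    min_pvalues = []
    for _ in range(n_perm):
        permuted = outcome[:]
        rng.shuffle(permuted)  # break any feature-outcome association
        min_pvalues.append(
            min(p_value(pearson_r(f, permuted), n) for f in features)
        )
    min_pvalues.sort()
    # Empirical alpha-quantile of the minimum p-values under the null.
    index = max(math.floor(alpha * n_perm) - 1, 0)
    mwsl = min_pvalues[index]
    return mwsl, alpha / mwsl


# Toy data: 60 hypothetical participants, 20 independent hypothetical features.
rng = random.Random(1)
outcome = [rng.gauss(0, 1) for _ in range(60)]
features = [[rng.gauss(0, 1) for _ in range(60)] for _ in range(20)]
mwsl, ent = estimate_mwsl(features, outcome)
print(f"MWSL ~ {mwsl:.4g}, effective number of tests ~ {ent:.1f}")
```

Because neighbouring spectral variables are strongly correlated, the effective number of tests estimated this way is typically well below the number of features, which is why a permutation-derived threshold can be less conservative than a Bonferroni correction.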

[1]  P. Elliott, et al.  Improving Visualization and Interpretation of Metabolome-Wide Association Studies: An Application in a Population-Based Cohort Using Untargeted 1H NMR Metabolic Profiling, 2017, Journal of Proteome Research.
