A network-based conditional genetic association analysis of the human metabolome

Abstract

Background: Genome-wide association studies have identified hundreds of loci that influence a wide variety of complex human traits; however, little is known about the biological mechanisms through which these loci act. The recent accumulation of functional genomics (“omics”) data, including metabolomics data, has created new opportunities for studying the functional role of specific changes in the genome. Functional genomic data are characterized by high dimensionality, (strong) statistical dependency between traits, and potentially complex genetic control. The analysis of such data therefore requires dedicated statistical genetics methods.

Results: To facilitate our understanding of the genetic control of omics phenotypes, we propose a trait-centered, network-based conditional genetic association (cGAS) approach for identifying the direct effects of genetic variants on omics-based traits. For each trait of interest, we selected from a biological network a set of other traits to be used as covariates in the cGAS. The network can be reconstructed either from biological pathway databases (a mechanistic approach) or directly from the data, using a Gaussian graphical model applied to the metabolome (a data-driven approach). We derived mathematical expressions that allow the power of univariate analyses to be compared with that of conditional genetic association analyses. We then tested our approach using data from the population-based Cooperative Health Research in the Region of Augsburg (KORA) study (n = 1,784 subjects, 1.7 million single-nucleotide polymorphisms) with measured data for 151 metabolites.

Conclusions: We found that, compared with single-trait analysis, a genetic association analysis that includes biologically relevant covariates can either gain or lose power, depending on the specific pleiotropic scenario; we provide empirical examples of both. For the metabolomics data analyzed here, the mechanistic network approach had more power than the data-driven approach. Nevertheless, we believe that our analysis shows that neither a prior-knowledge-only approach nor a phenotypic-data-only approach is optimal, and we discuss possibilities for improvement.
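The data-driven variant described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce their power expressions. Everything in it is an assumption made for illustration: the simulated data, the function names, the partial-correlation threshold of 0.2, and the effect sizes. Partial correlations are taken from the inverse sample covariance (a simple Gaussian graphical model estimate), the traits most strongly linked to the trait of interest become covariates, and the SNP effect is then tested with and without them.

```python
import numpy as np

def partial_correlations(traits):
    """Partial correlation matrix derived from the inverse sample covariance."""
    precision = np.linalg.inv(np.cov(traits, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def select_covariates(pcor, trait, threshold):
    """Indices of other traits whose |partial correlation| with `trait` exceeds threshold."""
    idx = np.flatnonzero(np.abs(pcor[trait]) > threshold)
    return [int(j) for j in idx if j != trait]

def snp_t_statistic(y, snp, covariates=None):
    """t-statistic of the SNP effect in an OLS of y on intercept, SNP, and optional covariates."""
    cols = [np.ones_like(y), snp]
    if covariates is not None and covariates.shape[1] > 0:
        cols.extend(covariates.T)
    design = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = len(y) - design.shape[1]
    sigma2 = resid @ resid / dof
    cov_beta = sigma2 * np.linalg.inv(design.T @ design)
    return beta[1] / np.sqrt(cov_beta[1, 1])

def simulate_example(seed=0, n=2000):
    """Hypothetical scenario: trait 1 shares non-genetic variation with trait 0."""
    rng = np.random.default_rng(seed)
    snp = rng.binomial(2, 0.3, size=n).astype(float)   # additive genotype coding 0/1/2
    shared = rng.normal(size=n)                        # variation shared by traits 0 and 1
    trait0 = 0.3 * snp + shared + 0.5 * rng.normal(size=n)
    trait1 = shared + 0.5 * rng.normal(size=n)         # no direct SNP effect
    trait2 = rng.normal(size=n)                        # unrelated trait
    traits = np.column_stack([trait0, trait1, trait2])

    covs = select_covariates(partial_correlations(traits), trait=0, threshold=0.2)
    t_univariate = snp_t_statistic(trait0, snp)
    t_conditional = snp_t_statistic(trait0, snp, traits[:, covs])
    return covs, t_univariate, t_conditional

if __name__ == "__main__":
    covs, t_uni, t_cond = simulate_example()
    print("selected covariates:", covs)
    print("univariate t:", t_uni, "conditional t:", t_cond)
```

In this simulated scenario the covariate absorbs variation unrelated to the SNP, so conditioning tends to raise the test statistic. If the covariate were itself affected by the SNP in the same direction as the trait of interest, conditioning could instead remove part of the signal, which corresponds to the power loss the abstract mentions.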
