A novel Mendelian randomization method identifies causal relationships between gene expression and low-density lipoprotein cholesterol levels

Robust inference of causal relationships between gene expression and complex traits using Mendelian randomization (MR) approaches is confounded by pleiotropy and by linkage disequilibrium (LD) between expression quantitative trait loci (eQTLs). Here we propose a new MR method, MR-link, that accounts for unobserved pleiotropy and LD by leveraging information from individual-level data. In simulations, MR-link shows false positive rates close to expectation (median 0.05) and high power (up to 0.89), outperforming all other MR methods we tested, even when only one eQTL variant is present. Application of MR-link to low-density lipoprotein cholesterol (LDL-C) measurements in 12,449 individuals and eQTL summary statistics from whole blood and liver identified 19 genes causally linked to LDL-C. These include the previously functionally validated SORT1 gene and the PVRL2 gene, located in the APOE locus, for which a causal role in liver was not yet known. Our results showcase the strength of MR-link for transcriptome-wide causal inference.
