Mendelian randomization integrating GWAS and eQTL data reveals genetic determinants of complex and clinical traits

Genome-wide association studies (GWAS) have identified thousands of variants associated with complex traits, but their biological interpretation often remains unclear. Most of these variants overlap with expression QTLs, indicating their potential involvement in regulation of gene expression. Here, we propose a transcriptome-wide summary statistics-based Mendelian Randomization approach (TWMR) that uses multiple SNPs as instruments and multiple gene expression traits as exposures, simultaneously. Applied to 43 human phenotypes, it uncovers 3,913 putatively causal gene–trait associations, 36% of which have no genome-wide significant SNP nearby in previous GWAS. Using independent association summary statistics, we find that the majority of these loci were missed by GWAS due to power issues. Noteworthy among these links is educational attainment-associated BSCL2, known to carry mutations leading to a Mendelian form of encephalopathy. We also find pleiotropic causal effects suggestive of mechanistic connections. TWMR better accounts for pleiotropy and has the potential to identify biological mechanisms underlying complex traits.

Many genetic variants identified in genome-wide association studies are associated with gene expression. Here, Porcu et al. propose a transcriptome-wide summary statistics-based Mendelian randomization approach (TWMR) that, applied to 43 human traits, uncovers hundreds of previously unreported gene–trait associations.
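To make the idea of "multiple SNPs as instruments and multiple gene expression traits as exposures" concrete, the following is a minimal sketch of a generic multivariable summary-statistic MR estimate, computed as a generalized least-squares regression of SNP-trait effects on SNP-expression effects with the SNP correlation (LD) matrix as weights. It is an illustration under stated assumptions, not the authors' implementation: the function names `solve` and `multivariable_mr`, the toy effect sizes, and the LD matrix are all hypothetical, and standard errors, instrument selection, and pleiotropy checks are omitted.

```python
def solve(a, b):
    """Solve a @ x = b for a square matrix a (Gaussian elimination, partial pivoting)."""
    n = len(a)
    # Work on an augmented copy so the inputs are left untouched.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def multivariable_mr(E, g, C):
    """Estimate causal effects of k exposures on one outcome from summary statistics.

    E: n x k list of lists, standardized SNP effects on each gene's expression (assumed input)
    g: length-n list, standardized SNP effects on the trait (assumed input)
    C: n x n SNP correlation (LD) matrix (assumed input)
    Returns alpha = (E' C^-1 E)^-1 E' C^-1 g, a length-k list.
    """
    n, k = len(E), len(E[0])
    cinv_g = solve(C, g)                                   # C^-1 g
    cinv_E = [solve(C, [E[i][j] for i in range(n)])        # columns of C^-1 E
              for j in range(k)]
    A = [[sum(E[i][r] * cinv_E[c][i] for i in range(n)) for c in range(k)]
         for r in range(k)]                                # E' C^-1 E
    b = [sum(E[i][r] * cinv_g[i] for i in range(n)) for r in range(k)]  # E' C^-1 g
    return solve(A, b)


# Toy data: 4 SNPs, 2 genes; SNPs 1 and 2 are in modest LD (r = 0.2).
E = [[0.30, 0.05], [0.10, 0.25], [0.20, -0.10], [0.00, 0.15]]
C = [[1.0, 0.2, 0.0, 0.0],
     [0.2, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
alpha_true = [0.5, -0.2]
# Noise-free SNP-trait effects implied by the two causal effects.
g = [sum(E[i][j] * alpha_true[j] for j in range(2)) for i in range(4)]

alpha_hat = multivariable_mr(E, g, C)
print(alpha_hat)
```

Because the toy SNP-trait effects are generated without noise, the estimate recovers the two assumed causal effects up to floating-point error. With real summary statistics the inputs would come from separate eQTL and GWAS studies, and the estimate would carry sampling uncertainty.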

Zoltán Kutalik | Kaido Lepik | Alexandre Reymond | Sina A. Gharib | Sina Rüeger | P. Visscher | T. Lehtimäki | A. Reymond | M. Perola | M. Bonder | C. Wijmenga | Z. Kutalik | P. Awadalla | A. Battle | B. Psaty | S. Ripatti | T. Esko | G. Montgomery | L. Milani | H. Prokisch | D. Boomsma | M. Stumvoll | M. Swertz | U. Marigorta | L. H. van den Berg | J. Veldink | J. Kettunen | J. Hottenga | A. Isaacs | M. Beekman | H. Westra | J. Deelen | M. Müller-Nurasyid | P. Slagboom | C. V. van Duijn | L. Franke | Yungil Kim | B. Pierce | H. Ahsan | K. Schramm | A. Teumer | U. Völker | E. Slagboom | M. Vermaat | F. van Dijk | Patrick Deelen | M. van Iterson | Jian Yang | M. Kähönen | A. Andiappan | O. Rotzschke | S. Kasela | A. Saha | K. Krohn | M. Nauck | J. van Dongen | P. Kovacs | A. Tönjes | G. Hemani | E. Porcu | D. van Heemst | S. Rüeger | H. Yaghootkar | Bernett Lee | R. Jansen | I. Nooren | M. Christiansen | R. Luijk | Matthijs Moed | J. Bot | P. M. Jhamai | M. Verbiest | H. Suchiman | R. van der Breggen | J. V. van Rooij | N. Lakenberg | W. Arindrarto | S. Kiełbasa | E. Tigchelaar | R. Pool | C. V. D. van der Kallen | C. Schalkwijk | E. V. van Zwet | H. Mei | B. Heijmans | Futao Zhang | J. Thiery | Patrick F. Sullivan | F. Santoni | U. Võsa | B. Zeng | M. Nivard | H. Kirsten | A. Claringbould | N. Pervjakova | M. Favé | M. Agbessi | I. Seppälä | L. Tong | J. Verlouw | V. Kukushkina | A. Kalnapenkis | J. Kronberg-Guzman | F. Beutner | G. Gibson | C. Stehouwer | B. Penninx | I. Alves | Marc Jan Bonder | Federico A Santoni | Eleonora Porcu | S. Zhernakova | B. Hofman | Dasha V. Zhernakova | M. Loeffler | P. Sullivan | J. van Meurs | O. Raitakari | Markus Scholz | M. V. van Greevenbroek | J. Powell | A. Uitterlinden | M. Verkerk | D. V. Zhernakova | P. V. van ’t Hof | M. van Galen | Peter A. C. ’t Hoen | T. Frayling | Coen D. A. Stehouwer | Olli T. Raitakari | M. Moed | Hailang Mei
