A Powerful Framework for Integrating eQTL and GWAS Summary Data

Two gene-based association methods, PrediXcan (for GWAS individual-level data) and TWAS (for GWAS summary data), were recently proposed to integrate GWAS with eQTL data. They alleviate two common problems in GWAS by boosting statistical power and by facilitating biological interpretation of GWAS discoveries. Based on a novel reformulation of PrediXcan and TWAS, we propose a more powerful gene-based association test that integrates a single set or multiple sets of eQTL data with GWAS individual-level data or summary statistics. We applied the proposed test to several GWAS datasets, including two lipid summary association datasets based on ∼100,000 and ∼189,000 samples, respectively. It uncovered more known or novel trait-associated genes, demonstrating the much improved performance of the proposed method. The software implementing the proposed method is freely available as an R package.
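As background, the summary-data test used by TWAS is commonly written as a weighted sum of per-SNP GWAS z-scores, with eQTL-derived weights and an LD correlation matrix supplying the variance. The sketch below illustrates only that general form, which the abstract itself does not spell out. It is not the test proposed here, and it is not taken from the authors' R package. The function names `weighted_sum_z` and `two_sided_p` are hypothetical, and the numbers are toy values rather than real data.

```python
import math


def weighted_sum_z(weights, z_scores, ld):
    """Gene-level z-score of the form sum_j w_j z_j / sqrt(w' R w).

    weights  : eQTL-derived SNP weights (assumed to be given)
    z_scores : per-SNP GWAS z-scores for the same SNPs, in the same order
    ld       : SNP correlation (LD) matrix R as a list of rows
    """
    p = len(weights)
    if len(z_scores) != p or len(ld) != p or any(len(row) != p for row in ld):
        raise ValueError("weights, z_scores and ld must have matching sizes")

    # Weighted sum of the single-SNP z-scores.
    numerator = sum(w * z for w, z in zip(weights, z_scores))

    # Variance of the weighted sum under the null: w' R w.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(p) for j in range(p)
    )
    if variance <= 0:
        raise ValueError("w' R w must be positive")

    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


# Toy inputs (hypothetical): three SNPs in moderate LD.
toy_weights = [0.5, -0.2, 0.1]
toy_z = [2.1, -1.4, 0.3]
toy_ld = [
    [1.0, 0.3, 0.1],
    [0.3, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
gene_z = weighted_sum_z(toy_weights, toy_z, toy_ld)
gene_p = two_sided_p(gene_z)
```

With individual-level data, the analogous step would be to regress the trait on the weighted genotype score directly. The sketch covers only the summary-statistic case.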
