A method to estimate the contribution of regional genetic associations to complex traits from summary association statistics

Despite considerable effort, known genetic associations explain only a small fraction of predicted heritability. Regional associations combine information from multiple contiguous genetic variants and can increase the variance explained at established association loci. However, regional associations are not easily estimated from summary association statistics because the estimates are sensitive to linkage disequilibrium (LD). We propose a method, LD Adjusted Regional Genetic Variance (LARGV), to estimate the phenotypic variance explained by regional associations from summary statistics while accounting for LD. The method is asymptotically equivalent to a multiple linear regression model when no interaction or haplotype effects are present. Its applications include ranking genetic regions by variance explained and comparing the variance explained by two or more regions. Using height and BMI data from the Health and Retirement Study (N = 7,776), we show that most genetic variance lies in a small proportion of the genome and that previously identified linkage peaks have higher than expected regional variance.
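To illustrate why LD matters here, the following minimal sketch computes the variance explained by a region under the multiple linear regression model that the abstract names as the asymptotic equivalent. It is not the LARGV procedure itself, whose details are not given in this text. It assumes standardized genotypes and phenotype, so that the summary statistics reduce to marginal SNP-phenotype correlations, and it assumes an invertible LD correlation matrix for the region. The function name `regional_r2` and the toy numbers are hypothetical.

```python
import numpy as np

def regional_r2(marginal_r, ld, n):
    """Regional R^2 from marginal correlations and an LD matrix.

    Assumptions (not taken from the source): genotypes and phenotype are
    standardized, `ld` is the SNP-by-SNP correlation matrix of the region
    and is invertible, and `n` is the sample size.
    """
    marginal_r = np.asarray(marginal_r, dtype=float)
    ld = np.asarray(ld, dtype=float)
    k = marginal_r.size
    # Joint (multiple regression) standardized effects solve ld @ beta = r.
    joint_beta = np.linalg.solve(ld, marginal_r)
    # Multiple regression R^2 for standardized variables: r' ld^{-1} r.
    r2 = float(marginal_r @ joint_beta)
    # Standard adjusted R^2, penalizing the k predictors in the region.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return r2, adj_r2

# Toy region: two SNPs in strong LD (0.8), each with marginal correlation 0.1.
ld = np.array([[1.0, 0.8], [0.8, 1.0]])
r = np.array([0.1, 0.1])
r2, adj_r2 = regional_r2(r, ld, n=7776)

# Summing squared marginal correlations ignores LD and double-counts signal.
naive = float(np.sum(r ** 2))
```

In this toy region the naive sum of squared marginal correlations is larger than the LD-adjusted value, because the two correlated SNPs largely tag the same signal.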
