Widespread allelic heterogeneity in complex traits

Recent successes in genome-wide association studies (GWASs) make it possible to address important questions about the genetic architecture of complex traits, such as allele frequency and effect size. One lesser-known aspect of complex traits is the extent of allelic heterogeneity (AH) arising from multiple causal variants at a locus. We developed a computational method to infer the probability of AH and applied it to three GWAS and four expression quantitative trait loci (eQTL) datasets. We identified a total of 4152 loci with strong evidence of AH. The proportion of all loci with identified AH is 4-23% in eQTLs, 35% in GWAS of High-Density Lipoprotein (HDL), and 23% in schizophrenia. For eQTLs, we observed a strong correlation between sample size and the proportion of loci with AH (R2=0.85, P = 2.2e-16), indicating that statistical power prevents identification of AH in other loci. Understanding the extent of AH may guide the development of new methods for fine mapping and association mapping of complex traits.

[1]  Judy H. Cho,et al.  Finding the missing heritability of complex diseases , 2009, Nature.

[2]  P. Visscher,et al.  Another Explanation for Apparent Epistasis , 2014 .

[3]  X. Estivill,et al.  Geographic distribution and regional origin of 272 cystic fibrosis mutations in European populations , 1997 .

[4]  Eleazar Eskin,et al.  Increasing Power in Association Studies by Using Linkage Disequilibrium Structure and Molecular Function as Prior Information , 2008, RECOMB.

[5]  Kenny Q. Ye,et al.  An integrated map of genetic variation from 1,092 human genomes , 2012, Nature.

[6]  Simon C. Potter,et al.  Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls , 2007, Nature.

[7]  H. Kang,et al.  Variance component model to account for sample structure in genome-wide association studies , 2010, Nature Genetics.

[8]  K. Mossman The Wellcome Trust Case Control Consortium, U.K. , 2008 .

[9]  Peter Donnelly,et al.  HAPGEN2: simulation of multiple disease SNPs , 2011, Bioinform..

[10]  Gaurav Bhatia,et al.  Fast and accurate imputation of summary statistics enhances evidence of functional enrichment , 2013, Bioinform..

[11]  Jong Wha J. Joo,et al.  Multiple testing correction in linear mixed models , 2016, Genome Biology.

[12]  C. Wallace,et al.  Bayesian Test for Colocalisation between Pairs of Genetic Association Studies Using Summary Statistics , 2013, PLoS genetics.

[13]  Eleazar Eskin,et al.  Incorporating prior information into association studies , 2012, Bioinform..

[14]  D. Heckerman,et al.  Efficient Control of Population Structure in Model Organism Association Mapping , 2008, Genetics.

[15]  George P Patrinos,et al.  HbVar: A relational database of human hemoglobin variants and thalassemia mutations at the globin gene server , 2002, Human mutation.

[16]  Ying Liu,et al.  FaST linear mixed models for genome-wide association studies , 2011, Nature Methods.

[17]  E. Dermitzakis,et al.  Candidate Causal Regulatory Effects by Integration of Expression QTLs with Complex Trait Genetic Associations , 2010, PLoS genetics.

[18]  Bonnie Berger,et al.  Efficient Bayesian mixed model analysis increases association power in large cohorts , 2014 .

[19]  P. Visscher,et al.  Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits , 2012, Nature Genetics.

[20]  Warren W. Kretzschmar,et al.  Sparse whole genome sequencing identifies two loci for major depressive disorder , 2015, Nature.

[21]  J. Lupski,et al.  Non-coding genetic variants in human disease. , 2015, Human molecular genetics.

[22]  Joseph E. Powell,et al.  Detection and replication of epistasis influencing transcription in humans , 2014, Nature.

[23]  John A. Todd,et al.  Statistical colocalization of monocyte gene expression and genetic risk variants for type 1 diabetes , 2012, Human molecular genetics.

[24]  Shane J. Neph,et al.  Systematic Localization of Common Disease-Associated Variation in Regulatory DNA , 2012, Science.

[25]  Eleazar Eskin,et al.  Identification of causal genes for complex traits , 2015, Bioinform..

[26]  Tanya M. Teslovich,et al.  Biological, Clinical, and Population Relevance of 95 Loci for Blood Lipids , 2010, Nature.

[27]  M. Peters,et al.  Systematic identification of trans eQTLs as putative drivers of known disease associations , 2013, Nature Genetics.

[28]  F. Collins,et al.  Potential etiologic and functional implications of genome-wide association loci for human diseases and traits , 2009, Proceedings of the National Academy of Sciences.

[29]  C. Spencer,et al.  Biological Insights From 108 Schizophrenia-Associated Genetic Loci , 2014, Nature.

[30]  M. Stephens,et al.  Genome-wide Efficient Mixed Model Analysis for Association Studies , 2012, Nature Genetics.

[31]  D. Altshuler,et al.  A map of human genome variation from population-scale sequencing , 2010, Nature.

[32]  Eleazar Eskin,et al.  Interpreting Meta-Analyses of Genome-Wide Association Studies , 2012, PLoS genetics.

[33]  Pedro G. Ferreira,et al.  Transcriptome and genome sequencing uncovers functional variation in humans , 2013, Nature.

[34]  Ayellet V. Segrè,et al.  Colocalization of GWAS and eQTL Signals Detects Target Genes , 2016, bioRxiv.

[35]  Simon C. Potter,et al.  The Architecture of Gene Regulatory Variation across Multiple Human Tissues: The MuTHER Study , 2011, PLoS genetics.

[36]  Jake K. Byrnes,et al.  Bayesian refinement of association signals for 14 loci in 3 common diseases , 2012, Nature Genetics.

[37]  ENCODEConsortium,et al.  An Integrated Encyclopedia of DNA Elements in the Human Genome , 2012, Nature.

[38]  Jonathan P. Beauchamp,et al.  GWAS of 126,559 Individuals Identifies Genetic Variants Associated with Educational Attainment , 2013, Science.

[39]  Jun S. Liu,et al.  The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans , 2015, Science.

[40]  M. McMullen,et al.  A unified mixed-model method for association mapping that accounts for multiple levels of relatedness , 2006, Nature Genetics.

[41]  Eleazar Eskin,et al.  Identifying Causal Variants at Loci with Multiple Signals of Association , 2014, Genetics.

[42]  Eleazar Eskin,et al.  An Optimal Weighted Aggregated Association Test for Identification of Rare Variants Involved in Common Diseases , 2011, Genetics.

[43]  A. Dunning,et al.  Beyond GWASs: illuminating the dark road from association to function. , 2013, American journal of human genetics.

[44]  D. Clayton,et al.  Genome-wide association study and meta-analysis finds over 40 loci affect risk of type 1 diabetes , 2009, Nature Genetics.

[45]  Sharon R Grossman,et al.  Integrating common and rare genetic variation in diverse human populations , 2010, Nature.