Evaluating the contribution of rare variants to type 2 diabetes and related traits using pedigrees

Significance: Contributions of rare variants to common and complex traits such as type 2 diabetes (T2D) are difficult to measure. This paper describes our results from deep whole-genome analysis of large Mexican-American pedigrees, which we used to understand the role of rare sequence variation in T2D and related traits through enriched allele counts in pedigrees. Our study design was well powered to detect association if rare variants with large effects collectively accounted for large portions of risk variability, but our results did not identify such variants in this sample. We further quantified the contributions of common and rare variants to gene expression profiles and concluded that rare expression quantitative trait loci explain a substantive, but minor, portion of expression heritability.

A major challenge in evaluating the contribution of rare variants to complex disease is identifying enough copies of the rare alleles to permit informative statistical analysis. To investigate the contribution of rare variants to the risk of type 2 diabetes (T2D) and related traits, we performed deep whole-genome analysis of 1,034 members of 20 large Mexican-American families with high prevalence of T2D. If rare variants of large effect accounted for much of the diabetes risk in these families, our experiment was powered to detect association. Using gene expression data on 21,677 transcripts for 643 pedigree members, we identified evidence for large-effect rare-variant cis-expression quantitative trait loci that could not be detected in population studies, validating our approach. However, we did not identify any rare variants of large effect associated with T2D, or with the related traits of fasting glucose and insulin, suggesting that large-effect rare variants account for only a modest fraction of the genetic risk of these traits in this sample of families. Reliable identification of large-effect rare variants will require larger samples of extended pedigrees or different study designs that further enrich for such variants.
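
The idea of enriched allele counts in pedigrees can be illustrated with a back-of-the-envelope calculation. The sketch below is not the study's power analysis. It assumes a single heterozygous founder, that each carrier has a fixed number of children with a non-carrier partner, and Mendelian transmission with probability 1/2. The function names, the family size of 4, the 3 generations, and the allele frequency of 0.001 are all hypothetical values chosen for illustration; only the sample size of 1,034 comes from the text.

```python
def expected_descendant_carriers(children_per_carrier: float, generations: int) -> float:
    """Expected number of descendants carrying one founder allele copy.

    Assumes (hypothetically) that every carrier is heterozygous, mates with a
    non-carrier, and has `children_per_carrier` children.
    """
    # Each child of a heterozygous carrier inherits the allele with
    # probability 1/2, so generation t holds (children_per_carrier / 2) ** t
    # expected carriers.
    return sum((children_per_carrier / 2) ** t for t in range(1, generations + 1))


def expected_copies_unrelated(allele_freq: float, n_individuals: int) -> float:
    """Expected allele copies in a sample of unrelated diploid individuals."""
    return 2 * n_individuals * allele_freq


if __name__ == "__main__":
    # Hypothetical pedigree: 4 children per carrier, 3 generations of descendants.
    in_pedigree = expected_descendant_carriers(children_per_carrier=4, generations=3)
    # Hypothetical allele frequency of 0.001 in a population sample of 1,034 people.
    in_population = expected_copies_unrelated(allele_freq=0.001, n_individuals=1034)
    print(f"Expected carriers among descendants of one founder: {in_pedigree:.1f}")
    print(f"Expected copies among 1,034 unrelated individuals: {in_population:.2f}")
```

Under these assumptions, one founder copy in a large family is expected to yield several times more observed copies than a population sample of the same size would hold for an allele at that frequency. This is the intuition behind using extended pedigrees, although real pedigrees vary in size and structure.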
