Imputation-Based Analysis of Association Studies: Candidate Regions and Quantitative Traits

We introduce a new framework for the analysis of association studies, designed to allow untyped variants to be tested more effectively and directly for association with a phenotype. The idea is to combine knowledge of patterns of correlation among SNPs (e.g., from the International HapMap Project or resequencing data in a candidate region of interest) with genotype data at tag SNPs collected on a phenotyped study sample, to estimate (“impute”) unmeasured genotypes and then assess association between the phenotype and these estimated genotypes. Compared with standard single-SNP tests, this approach increases power to detect association, even when the causal variant is typed, with the greatest gain occurring when multiple causal variants are present. It also provides more interpretable explanations for observed associations, including an assessment, for each SNP, of the strength of the evidence that it (rather than another correlated SNP) is causal. Although we focus on association studies with a quantitative phenotype and a relatively restricted region (e.g., a candidate gene), the framework is applicable to, and computationally practical for, whole-genome association studies. The methods described here are implemented in a software package, Bim-Bam, available from the Stephens Lab website: http://stephenslab.uchicago.edu/software.html.
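The impute-then-test idea can be illustrated with a toy sketch. This is not the method implemented in Bim-Bam: as a deliberately simplified stand-in, it predicts an untyped SNP's allele dosage by least-squares regression on tag SNPs in a reference panel, then computes a simple linear-regression t statistic for the phenotype against the imputed dosage. All function names, the simulated data, and the parameter values below are hypothetical and chosen only for illustration.

```python
import numpy as np


def impute_dosage(panel_tags, panel_target, study_tags):
    """Predict an untyped SNP's dosage from tag SNPs (toy stand-in for imputation)."""
    # Fit target ~ intercept + tags on the reference panel, where all SNPs are typed.
    x_panel = np.column_stack([np.ones(len(panel_tags)), panel_tags])
    coef, *_ = np.linalg.lstsq(x_panel, panel_target, rcond=None)
    # Apply the fitted coefficients to the study sample, where only tags are typed.
    x_study = np.column_stack([np.ones(len(study_tags)), study_tags])
    # An allele count must lie between 0 and 2.
    return np.clip(x_study @ coef, 0.0, 2.0)


def association_t_stat(dosage, phenotype):
    """t statistic for the slope in the regression phenotype ~ dosage."""
    x = dosage - dosage.mean()
    y = phenotype - phenotype.mean()
    slope = (x @ y) / (x @ x)
    resid = y - slope * x
    se = np.sqrt((resid @ resid) / (len(y) - 2) / (x @ x))
    return slope / se


def simulate(rng, n):
    """Hypothetical data: one target SNP, one correlated tag, one unrelated tag."""
    target = rng.binomial(2, 0.3, size=n)
    # Tag 1 copies the target except for a random 10% of individuals.
    flip = rng.random(n) < 0.1
    tag1 = np.where(flip, rng.binomial(2, 0.3, size=n), target)
    tag2 = rng.binomial(2, 0.5, size=n)
    return np.column_stack([tag1, tag2]).astype(float), target.astype(float)


rng = np.random.default_rng(0)
panel_tags, panel_target = simulate(rng, 200)   # reference panel
study_tags, study_target = simulate(rng, 500)   # study sample (target untyped)
# Quantitative phenotype with an assumed additive effect of the untyped SNP.
phenotype = 0.5 * study_target + rng.normal(size=500)

dosage = impute_dosage(panel_tags, panel_target, study_tags)
t_stat = association_t_stat(dosage, phenotype)
print("correlation of imputed and true dosage:",
      np.corrcoef(dosage, study_target)[0, 1])
print("t statistic at the imputed SNP:", t_stat)
```

In this sketch the study sample's true genotypes at the untyped SNP are used only to simulate the phenotype and to check imputation quality; the association test itself sees only the tag SNPs and the reference panel.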
