Analyses and Comparison of Imputation-Based Association Methods

Genotype imputation methods have become increasingly popular for recovering untyped genotype data. An important application of imputed genotypes is testing for genetic association with disease. Imputation-based association tests can provide insight beyond that obtained by testing typed tagging SNPs alone. A variety of effective imputation-based association tests have been proposed, but their performance is affected by several genetic factors that have not been well studied. In this study, using both simulated and real data sets, we investigated the effects of linkage disequilibrium (LD), the minor allele frequency (MAF) of the untyped causal SNP, and imputation accuracy on the performance of seven popular imputation-based association methods: MACH2qtl/dat, SNPTEST, ProbABEL, Beagle, PLINK, BIMBAM and SNPMStat. We also aimed to provide a comprehensive comparison among the methods. The results show that: 1) imputation-based association tests can boost signals and improve power under medium and high LD levels, with the power improvement increasing as LD strengthens; 2) power increases with higher MAF of the untyped causal SNP under medium to high LD levels; 3) under low LD, a high imputation accuracy rate cannot guarantee an improvement in power; and 4) among the methods, MACH2qtl/dat, ProbABEL and SNPTEST perform similarly and consistently outperform the other methods. Our results are helpful in guiding the choice of imputation-based association tests in practical applications.
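For readers less familiar with the three quantities varied in this study, the following minimal Python sketch shows one common way each can be computed from genotypes coded as 0, 1 or 2 allele copies. The function names are hypothetical and do not come from any of the seven tools listed above. The squared genotype correlation is used here only as a simple proxy for LD, and the expected dosage is one common summary of an imputed genotype; neither is claimed to be the exact definition used in this study.

```python
def minor_allele_frequency(genotypes):
    """MAF from genotypes coded as 0, 1 or 2 copies of one allele."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def genotype_r2(x, y):
    """Squared Pearson correlation between two genotype (or dosage) vectors.

    Used as a simple proxy for LD between a typed SNP and an untyped SNP,
    or for agreement between imputed dosages and true genotypes.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker carries no correlation information
    return sxy * sxy / (sxx * syy)


def expected_dosage(probs):
    """Expected allele count from posterior probabilities (P(0), P(1), P(2))."""
    p0, p1, p2 = probs
    return p1 + 2 * p2
```

As a usage illustration, `expected_dosage((0.1, 0.3, 0.6))` gives an expected allele count of 1.5, and applying `genotype_r2` to a vector of such dosages and the corresponding true genotypes yields one possible measure of imputation accuracy.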

[38]  N. Dafny,et al.  Evidence that opiate addiction is in part an immune response Destruction of the immune system by irradiation-altered opiate withdrawal , 1986, Neuropharmacology.