Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics

Two recently developed fine-mapping methods, CAVIAR and PAINTOR, outperform other fine-mapping methods. They also have the advantage of requiring only the marginal test statistics and the correlation among SNPs. Both methods are likelihood based and leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution. However, their relationship with Bayesian fine-mapping methods, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM perform better than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied the different methods to two independent cohorts with the same phenotype. The results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM were more consistent between the two cohorts in selecting the top 10 ranked SNPs. Software is available at https://bitbucket.org/Wenan/caviarbf.
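To make the idea of a Bayes factor computed from marginal test statistics concrete, the following is a minimal sketch and not the CAVIARBF implementation. It assumes that, under the null, the z statistics of a candidate causal set C follow N(0, R_CC), where R is the SNP correlation matrix, and that under the alternative they follow N(0, R_CC + s R_CC R_CC), where s is the prior effect variance scaled by sample size. The function names, the parameter `prior_var_scaled`, the uniform prior over enumerated models, and the cap `max_causal` are all illustrative assumptions.

```python
import itertools
import numpy as np


def log_bayes_factor(z, ld, causal, prior_var_scaled):
    """Log Bayes factor of a candidate causal set versus the null model.

    z: marginal z statistics, one per SNP.
    ld: SNP correlation matrix (assumed invertible on the causal subset).
    causal: indices of the SNPs assumed causal.
    prior_var_scaled: assumed prior effect variance times sample size (s).
    """
    idx = list(causal)
    z_c = np.asarray(z, dtype=float)[idx]
    r_cc = np.asarray(ld, dtype=float)[np.ix_(idx, idx)]
    # Alternative covariance: R_cc + s * R_cc R_cc.
    cov_alt = r_cc + prior_var_scaled * (r_cc @ r_cc)

    def log_density(cov):
        # Zero-mean Gaussian log density; the 2*pi term cancels in the ratio.
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (logdet + z_c @ np.linalg.solve(cov, z_c))

    return log_density(cov_alt) - log_density(r_cc)


def posterior_inclusion(z, ld, prior_var_scaled, max_causal=2):
    """Posterior inclusion probability per SNP by exhaustive enumeration.

    Assumes a uniform prior over all causal sets of size 1..max_causal.
    """
    m = len(z)
    models = [c for k in range(1, max_causal + 1)
              for c in itertools.combinations(range(m), k)]
    log_bf = np.array([log_bayes_factor(z, ld, c, prior_var_scaled)
                       for c in models])
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(log_bf - log_bf.max())
    weights /= weights.sum()
    pip = np.zeros(m)
    for c, w in zip(models, weights):
        pip[list(c)] += w
    return pip


# Toy example: two correlated SNPs with strong signals and one unlinked SNP.
z_toy = [5.0, 4.0, 0.5]
ld_toy = [[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]]
pip_toy = posterior_inclusion(z_toy, ld_toy, prior_var_scaled=4.0)
```

For a single SNP this reduces to log BF = -0.5 log(1 + s) + 0.5 z² s / (1 + s), which gives a quick sanity check. Exhaustive enumeration grows combinatorially with the number of SNPs and the assumed maximum number of causal variants, so the sketch is only practical for small regions.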
