Detecting epistasis in the presence of linkage disequilibrium: A focused comparison

We present results from a comparison of three epistasis-detection tools on large-scale simulated genetic data: SNPHarvester, SNPRuler and Ambience. The tools were chosen on their merits as representative of the state of the art in epistasis detection. We design and conduct experiments to test how well the methods detect interacting loci, or their proxies in linkage disequilibrium (LD) tagged regions, in datasets containing simulated 2-, 3- and 4-way epistatic interactions. The results show that SNPHarvester is the fastest, while Ambience is the most robust. Moreover, SNPRuler provides the best power, especially with higher-order interactions, but cannot scale up to larger datasets.
