A New Epistasis Detecting Algorithm Based on Ant Colony Optimization

Rapid developments in chip-based technology have greatly advanced human genetics and made access to thousands of single nucleotide polymorphisms (SNPs) routine, which in turn poses an informatics challenge. Characterizing and interpreting the genes and gene-gene interactions that affect susceptibility to common, complex multifactorial diseases is a computational and statistical challenge in genome-wide association studies (GWAS). Various methods have been proposed, but they are difficult to apply directly to GWAS because of the excessive search space and heavy computational burden. In this paper, we propose an ant colony optimization (ACO) based algorithm that combines the pheromone updating rule with heuristic information. We evaluated the power of our algorithm through extensive experiments on a wide range of simulated datasets and on a real genome-wide dataset. Experimental results demonstrate that our algorithm is time efficient and performs well in terms of power.
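
The abstract does not spell out the update rule, so the following is only a minimal sketch of how a generic ACO search over SNP subsets might combine pheromone with heuristic information. It assumes the textbook Ant System form (selection probability proportional to tau^alpha * eta^beta, followed by evaporation and deposit), not necessarily the rule used in this paper. All names (`select_snps`, `update_pheromone`, `aco_search`) and parameters (`alpha`, `beta`, `rho`, `n_ants`, `n_iters`) are hypothetical, and `fitness` stands in for whatever association score is used, for example a chi-square statistic on the case-control contingency table.

```python
import random


def select_snps(pheromone, heuristic, k, alpha=1.0, beta=1.0, rng=random):
    """One ant picks k distinct SNP indices.

    Each remaining SNP i is drawn with probability proportional to
    pheromone[i]**alpha * heuristic[i]**beta (assumed textbook rule).
    """
    candidates = list(range(len(pheromone)))
    chosen = []
    for _ in range(k):
        weights = [pheromone[i] ** alpha * heuristic[i] ** beta
                   for i in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)  # sample without replacement
    return tuple(sorted(chosen))


def update_pheromone(pheromone, scored_paths, rho=0.1):
    """Evaporate all trails by rate rho, then deposit each path's score."""
    updated = [(1.0 - rho) * tau for tau in pheromone]
    for path, score in scored_paths:
        for i in path:
            updated[i] += score
    return updated


def aco_search(n_snps, fitness, heuristic, k=2, n_ants=20, n_iters=30, seed=0):
    """Run the ant loop and return (best_path, best_score, pheromone)."""
    rng = random.Random(seed)
    pheromone = [1.0] * n_snps  # uniform initial trail (assumption)
    best_path, best_score = None, float("-inf")
    for _ in range(n_iters):
        scored = []
        for _ in range(n_ants):
            path = select_snps(pheromone, heuristic, k, rng=rng)
            score = fitness(path)
            scored.append((path, score))
            if score > best_score:
                best_path, best_score = path, score
        pheromone = update_pheromone(pheromone, scored)
    return best_path, best_score, pheromone


if __name__ == "__main__":
    # Toy fitness (hypothetical): the pair (3, 7) is the "interacting" one.
    def toy_fitness(path):
        return 1.0 if path == (3, 7) else 0.05

    path, score, tau = aco_search(10, toy_fitness, heuristic=[1.0] * 10)
    print(path, score)
```

In this sketch the heuristic vector plays the role of prior knowledge about each SNP (for instance a single-locus association score), while the pheromone vector accumulates evidence from the subsets the ants have already evaluated. Only a small fraction of the possible SNP combinations is scored per iteration, which is what makes this kind of search attractive for the large search spaces that GWAS data produce.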
