CMDR based differential evolution identifies the epistatic interaction in genome‐wide association studies

Motivation: Detecting epistatic interactions in genome‐wide association studies (GWAS) is a computational challenge. The huge number of single‐nucleotide polymorphism (SNP) combinations prevents some powerful algorithms from being applied to detect potential epistasis in large‐scale SNP datasets. Approach: We propose a new algorithm, termed DECMDR, which combines the differential evolution (DE) algorithm with classification‐based multifactor‐dimensionality reduction (CMDR). DECMDR uses CMDR as the fitness measure to evaluate candidate solutions during the DE process when scanning for potential statistical epistasis in GWAS. Results: On large simulated datasets and on real data obtained from the Wellcome Trust Case Control Consortium, DECMDR outperformed existing algorithms in terms of detection success rate. In terms of running time, DECMDR can efficiently apply CMDR to detect significant associations between cases and controls among all possible SNP combinations in GWAS. Availability and Implementation: DECMDR is freely available at https://goo.gl/p9sLuJ. Contact: chuang@isu.edu.tw or e0955767257@yahoo.com.tw. Supplementary information: Supplementary data are available at Bioinformatics online.
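The general idea in the Approach paragraph, a DE search over SNP combinations scored by an MDR-style classification measure, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' DECMDR implementation: the abstract does not define the CMDR measure, so the fitness here is the classic MDR classification error instead, and the function names (`search_space_size`, `mdr_error`, `de_search`), the rounding-based decoding of DE vectors into SNP indices, and all parameter defaults are hypothetical choices made for the example.

```python
import random
from math import comb


def search_space_size(n_snps, order):
    """Number of distinct SNP combinations of the given order."""
    return comb(n_snps, order)


def mdr_error(genotypes, labels, snps):
    """Classic MDR classification error for one SNP combination.

    Stand-in fitness (assumption): NOT the CMDR measure from the paper.
    genotypes: list of rows of genotype codes (0/1/2); labels: 1 = case, 0 = control.
    Assumes at least one case and one control are present.
    """
    cases, controls = {}, {}
    for row, y in zip(genotypes, labels):
        cell = tuple(row[s] for s in snps)
        target = cases if y == 1 else controls
        target[cell] = target.get(cell, 0) + 1
    n_case = sum(labels)
    threshold = n_case / (len(labels) - n_case)  # overall case:control ratio
    errors = 0
    for cell in set(cases) | set(controls):
        c, k = cases.get(cell, 0), controls.get(cell, 0)
        high_risk = c > threshold * k
        # High-risk cells predict "case", so their controls are misclassified;
        # low-risk cells predict "control", so their cases are misclassified.
        errors += k if high_risk else c
    return errors / len(labels)


def de_search(genotypes, labels, order=2, pop_size=20, generations=30,
              F=0.5, CR=0.5, seed=0):
    """DE/rand/1/bin over real-valued vectors decoded to SNP index tuples."""
    rng = random.Random(seed)
    n_snps = len(genotypes[0])

    def decode(vec):
        # Round and clip each coordinate to a valid index, then repair duplicates.
        idx = []
        for v in vec:
            i = min(n_snps - 1, max(0, round(v)))
            while i in idx:
                i = (i + 1) % n_snps
            idx.append(i)
        return tuple(sorted(idx))

    def fitness(vec):
        return mdr_error(genotypes, labels, decode(vec))

    pop = [[rng.uniform(0, n_snps - 1) for _ in range(order)]
           for _ in range(pop_size)]
    fit = [fitness(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(order)  # guarantees one mutated coordinate
            trial = [
                pop[a][d] + F * (pop[b][d] - pop[c][d])
                if (rng.random() < CR or d == j_rand) else pop[i][d]
                for d in range(order)
            ]
            f = fitness(trial)
            if f <= fit[i]:  # greedy selection: keep the trial if not worse
                pop[i], fit[i] = trial, f
    best = min(range(pop_size), key=fit.__getitem__)
    return decode(pop[best]), fit[best]
```

For scale, `search_space_size(500000, 2)` returns 124999750000, which is why exhaustive scoring of every pair becomes expensive and a heuristic search such as DE is attractive. Calling `de_search(genotypes, labels)` returns a sorted tuple of SNP indices together with its classification error; lower is better in this sketch.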
