Epistasis analysis using artificial intelligence.

Here we introduce an artificial intelligence (AI) methodology for detecting and characterizing epistasis in genetic association studies. The ultimate goal of our AI strategy is to analyze genome-wide genetic data as a human would, using sources of expert knowledge as a guide. The methodology presented here is based on computational evolution, which is a type of genetic programming. What distinguishes computational evolution from other genetic programming approaches is its ability to generate interesting solutions while at the same time learning how to solve the problem at hand. We provide a general overview of this approach and then present a few examples of its application to real data.
