ATHENA Optimization: The Effect of Initial Parameter Settings across Different Genetic Models

Rapidly advancing technology has allowed for the generation of massive amounts of data assessing variation across the human genome. One analysis method for this type of data is the genome-wide association study (GWAS), in which each variant is tested individually for association with disease. While these studies have elucidated novel etiology, much of the variation due to genetics remains unexplained. One hypothesis is that some of this unexplained variation lies in gene-gene interactions. An impediment to testing for interactions is the infeasibility of exhaustively searching all multi-locus models. Novel methods are being developed that perform a non-exhaustive search. Because these methods are new to genetic studies, rigorous parameter optimization is necessary. Here, we assess genotype encodings, function sets, and cross-over in two algorithms that use grammatical evolution to optimize neural networks or symbolic regression formulas in the ATHENA software package. Our results show that the effect of these parameters is highly dependent on the underlying disease model.
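
To make the infeasibility of an exhaustive multi-locus search concrete, the following minimal sketch counts how many distinct k-locus models exist for a given number of variants. The panel size of 500,000 variants and the function name `n_multilocus_models` are illustrative assumptions, not values or code taken from this study or from ATHENA.

```python
from math import comb


def n_multilocus_models(n_variants: int, order: int) -> int:
    """Number of distinct unordered sets of `order` loci drawn from `n_variants`."""
    return comb(n_variants, order)


if __name__ == "__main__":
    # Hypothetical panel size, chosen only for illustration.
    n_variants = 500_000
    for order in (1, 2, 3):
        count = n_multilocus_models(n_variants, order)
        print(f"{order}-locus models: {count:.3e}")
```

Under this assumed panel size, the single-variant scan tests 500,000 models, while the pairwise search already contains roughly 1.25 × 10^11 candidate models and the three-way search is on the order of 10^16. This growth is what motivates non-exhaustive search strategies.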
