Grammatical Evolution of Neural Networks for Discovering Epistasis among Quantitative Trait Loci

Growing interest in, and burgeoning technology for, discovering genetic mechanisms that influence disease processes have ushered in a flood of genetic association studies over the last decade, yet genetic variation has explained little of the heritability of highly studied complex traits. Non-additive gene-gene interactions, which are seldom explored, are thought to be one source of this “missing” heritability. Here we assess the performance of grammatical evolution of neural networks (GENN) for discovering gene-gene interactions that contribute to a quantitative heritable trait. We also present several modifications to the GENN procedure that yield modest improvements in performance.
