Solving complex problems in human genetics using GP: challenges and opportunities

The development of rapid data-collection technologies is changing the biomedical and biological sciences. In human genetics, chip-based methods facilitate the measurement of thousands of DNA sequence variations from across the human genome. The collection of genetic data is no longer a major rate-limiting step. Instead, the new challenges are the analysis and interpretation of these high-dimensional and frequently noisy datasets. The specific challenge we are interested in is the identification of combinations of interacting DNA sequence variations that are predictive of common human diseases. Specifically, we wish to detect epistasis, or gene-gene interactions. Here we focus solely on the situation where there is an epistatic effect but no detectable main effect. The challenge in applying search algorithms to this problem is that the accuracy of a model is not indicative of the quality of the attributes within the model: a model containing only one of two interacting variations predicts no better than chance. Instead, we pre-process the dataset to provide building blocks that enable our evolutionary computation strategy to discover an optimal model.
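The claim that model accuracy says little about attribute quality when there is no main effect can be illustrated with a small sketch. It is not the authors' method or data. It assumes a deliberately simplified toy dataset with binary attributes and a class label defined as the XOR of two of them, whereas real genotypes typically take three values per variation. The names `make_dataset` and `best_accuracy` are hypothetical helpers introduced only for this illustration.

```python
from collections import Counter, defaultdict
from itertools import product


def make_dataset():
    """Balanced toy data (an assumption, not real genotypes).

    Attributes 0 and 1 interact: class = a XOR b.
    Attribute 2 is irrelevant to the class.
    """
    return [((a, b, c), a ^ b) for a, b, c in product((0, 1), repeat=3)]


def best_accuracy(rows, attrs):
    """Accuracy of the best lookup-table model that sees only `attrs`.

    Each combination of attribute values predicts its majority class.
    """
    cells = defaultdict(Counter)
    for x, y in rows:
        key = tuple(x[i] for i in attrs)
        cells[key][y] += 1
    correct = sum(max(counts.values()) for counts in cells.values())
    return correct / len(rows)


rows = make_dataset()

# No main effects: every single attribute is at chance level.
single = {i: best_accuracy(rows, (i,)) for i in range(3)}

# The interacting pair predicts the class perfectly.
pair_ab = best_accuracy(rows, (0, 1))

# A model holding one relevant and one irrelevant attribute is still at chance,
# so its accuracy gives a search algorithm no hint that attribute 0 matters.
pair_ac = best_accuracy(rows, (0, 2))

print(single, pair_ab, pair_ac)
```

In this toy setting, a search guided only by accuracy receives the same score for a model that is half right (one relevant attribute) as for one that is entirely wrong, which is why some external source of information about attribute quality is needed to supply useful building blocks.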
