An improved semantic schema modeling for genetic programming

Considerable research effort has recently gone into improving the power of genetic programming (GP) by incorporating semantic awareness. The semantics of a tree refers to its behavior during execution. A reliable theoretical model of GP should therefore account for the behavior of individuals. Schema theory is a theoretical tool used to model the distribution of the population over a set of similar points in the search space, referred to as a schema. Prior schema theories define schemata at the syntactic level, and relying on them raises several major issues. Incorporating semantic awareness into schema theory has scarcely been studied in the literature. In this paper, we present an improved approach for developing the semantic schema in GP. The semantics of a tree is interpreted as the normalized mutual information between its output vector and the target. A new model of the semantic search space is introduced according to this definition of semantics, and the semantic building block space is presented as an intermediate space between the semantic and genotype spaces. An improved approach is provided for representing trees in the building block space. The presented schema is characterized by the Poisson distribution of trees in this space. The corresponding schema theory is developed to predict the expected number of individuals belonging to the proposed schema in the next generation. The suggested schema theory provides new insight into the relation between the syntactic and semantic spaces. It is shown to be efficient in comparison with the existing semantic schema in terms of both generalization and diversity preservation. Experimental results also indicate that the proposed schema is much less computationally expensive than that existing work.
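To make the semantics definition above concrete, the following is a minimal sketch of how a normalized mutual information score between a tree's output vector and the target could be computed. It is not the paper's estimator: the equal-width histogram, the bin count, the normalization by sqrt(H(X) H(Y)), and the names `entropy` and `normalized_mutual_information` are all assumptions made for illustration.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in bits) of a discrete probability array."""
    p = p[p > 0]  # empty bins contribute nothing
    return float(-np.sum(p * np.log2(p)))


def normalized_mutual_information(output, target, bins=8):
    """Histogram-based NMI between a tree's outputs and the target values.

    Assumptions (not taken from the paper): equal-width binning and
    normalization by sqrt(H(output) * H(target)).
    """
    joint, _, _ = np.histogram2d(output, target, bins=bins)
    pxy = joint / joint.sum()      # joint distribution over bin pairs
    px = pxy.sum(axis=1)           # marginal of the output
    py = pxy.sum(axis=0)           # marginal of the target
    hx, hy, hxy = entropy(px), entropy(py), entropy(pxy.ravel())
    mi = hx + hy - hxy             # I(X;Y) = H(X) + H(Y) - H(X,Y)
    denom = np.sqrt(hx * hy)
    if denom == 0:                 # a constant vector carries no information
        return 0.0
    return float(min(1.0, max(0.0, mi / denom)))


if __name__ == "__main__":
    target = np.linspace(-1.0, 1.0, 200)
    rng = np.random.default_rng(0)
    print(normalized_mutual_information(target, target))
    print(normalized_mutual_information(rng.normal(size=200), target))
```

Under this sketch, a tree whose outputs determine the target bin by bin scores close to 1, while a tree with constant output scores 0, so the score depends on the statistical dependence between output and target rather than on pointwise error.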
