Semantics Based Substituting Technique for Reducing Code Bloat in Genetic Programming

Genetic Programming (GP) is a technique in which computer programs, encoded as tree structures, are evolved using an evolutionary algorithm. Code bloat is a common phenomenon in GP: the size of individuals gradually increases during evolution, which degrades GP performance. To address this problem, we recently introduced a semantics-based code bloat control method, Substituting a subtree with an Approximate Terminal (SAT-GP). In this paper, we propose an extension of SAT-GP, namely Substituting a subtree with an Approximate Subprogram (SAS-GP). We tested this method with different GP parameter settings on a real-world time series forecasting problem. The experimental results demonstrate the benefit of the proposed method in reducing code bloat and improving GP performance. In particular, SAS-GP often achieves the best performance among the tested GP systems on four popular GP performance metrics.
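The abstract does not spell out the substitution procedure, so the following Python sketch only illustrates the general idea of replacing a subtree with a terminal of similar semantics. It assumes that the semantics of a (sub)program is its vector of outputs on the fitness cases and that closeness is measured by squared Euclidean distance. The tuple-based tree encoding and all function names (`semantics`, `closest_terminal`, `substitute_with_terminal`) are hypothetical choices made for illustration and are not taken from the paper.

```python
import random

# Hypothetical encoding: a tree is ("op", left, right), a variable name
# such as "x0", or a numeric constant.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def evaluate(tree, case):
    """Evaluate a tree on one fitness case (a dict of variable values)."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, case), evaluate(right, case))
    if isinstance(tree, str):
        return case[tree]
    return tree  # numeric constant

def semantics(tree, cases):
    """Assumed definition: the vector of outputs on all fitness cases."""
    return [evaluate(tree, case) for case in cases]

def size(tree):
    """Number of nodes in the tree."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1

def subtree_paths(tree, path=()):
    """Yield the path (sequence of child indices) to every node."""
    yield path
    if isinstance(tree, tuple):
        yield from subtree_paths(tree[1], path + (1,))
        yield from subtree_paths(tree[2], path + (2,))

def get_subtree(tree, path):
    for index in path:
        tree = tree[index]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return tuple(parts)

def closest_terminal(target, terminals, cases):
    """Pick the terminal whose semantics is nearest to the target vector."""
    def distance(terminal):
        return sum((a - b) ** 2
                   for a, b in zip(semantics(terminal, cases), target))
    return min(terminals, key=distance)

def substitute_with_terminal(tree, terminals, cases, rng):
    """Replace one randomly chosen internal subtree by its closest terminal."""
    internal = [p for p in subtree_paths(tree)
                if isinstance(get_subtree(tree, p), tuple)]
    if not internal:
        return tree  # a single terminal cannot shrink further
    path = rng.choice(internal)
    target = semantics(get_subtree(tree, path), cases)
    return replace_subtree(tree, path,
                           closest_terminal(target, terminals, cases))

if __name__ == "__main__":
    cases = [{"x0": float(i), "x1": float(i + 1)} for i in range(5)]
    tree = ("add", ("mul", "x0", 1.0), ("sub", "x1", "x1"))
    smaller = substitute_with_terminal(tree, ["x0", "x1", 0.0], cases,
                                       random.Random(0))
    print(size(tree), "->", size(smaller), smaller)
```

Replacing the terminal pool with a pool of small candidate subprograms would give a rough analogue of substituting an approximate subprogram, again under the same assumptions.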
