Scalable genetic programming by gene-pool optimal mixing and input-space entropy-based building-block learning

The Gene-pool Optimal Mixing Evolutionary Algorithm (GOMEA) is a recently introduced model-based evolutionary algorithm (EA) that has been shown to outperform state-of-the-art alternative EAs in terms of scalability on discrete optimization problems. A key aspect of GOMEA's success is its variation operator, which is designed to exploit linkage models extensively by effectively combining partial solutions. Here, we bring the strengths of GOMEA to Genetic Programming (GP), introducing GP-GOMEA. Assuming that little problem-specific knowledge is available, and in an effort to design easy-to-use EAs, GP-GOMEA requires no parameter specification. On a set of well-known benchmark problems, we find that GP-GOMEA outperforms standard GP while being on par with more recently introduced, state-of-the-art EAs. We furthermore introduce Input-space Entropy-based Building-block Learning (IEBL), a novel approach to identifying and encapsulating relevant building blocks (subroutines) into new terminals and functions. On problems with an inherent degree of modularity, IEBL can contribute to compact solution representations, providing a large potential for knock-on effects in performance. On the difficult but highly modular Even Parity problem, GP-GOMEA+IEBL obtains excellent scalability, solving the 14-bit instance in less than 1 hour.
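
To illustrate the kind of variation operator referred to above, the following is a minimal sketch of gene-pool optimal mixing on fixed-length binary strings. It is not the GP-GOMEA implementation: the function name `gene_pool_optimal_mixing`, the hand-supplied `linkage_sets` (standing in for a learned linkage model), and the acceptance rule "keep the change if fitness does not get worse" (with fitness maximized) are assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence

Genotype = List[int]


def gene_pool_optimal_mixing(
    solution: Genotype,
    population: Sequence[Genotype],
    linkage_sets: Sequence[Sequence[int]],
    fitness: Callable[[Genotype], float],
    rng: random.Random,
) -> Genotype:
    """Greedily mix donor genes into a copy of `solution`, one linkage set at a time.

    Hypothetical helper for illustration; fitness is assumed to be maximized.
    """
    offspring = list(solution)          # never modify the parent in place
    best_fitness = fitness(offspring)

    order = list(linkage_sets)
    rng.shuffle(order)                  # visit the linkage sets in random order

    for subset in order:
        donor = rng.choice(population)  # a fresh random donor for every subset
        backup = [offspring[i] for i in subset]

        # Copy the donor's genes at the positions of this linkage set.
        for i in subset:
            offspring[i] = donor[i]

        if [offspring[i] for i in subset] == backup:
            continue                    # nothing changed, so skip the evaluation

        new_fitness = fitness(offspring)
        if new_fitness >= best_fitness:
            best_fitness = new_fitness  # keep the partial solution
        else:
            # Revert: the donated genes made the solution worse.
            for i, gene in zip(subset, backup):
                offspring[i] = gene

    return offspring
```

A call such as `gene_pool_optimal_mixing(parent, population, [[0, 1], [2], [3]], sum, random.Random(0))` returns an offspring whose fitness is never lower than the parent's, because every mixing step that worsens fitness is undone. In a tree-based GP setting the positions would index nodes of a fixed tree template rather than bits; that mapping is outside the scope of this sketch.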
