Bounding bloat in genetic programming

While many optimization problems work with a fixed number of decision variables and thus a fixed-length representation of possible solutions, genetic programming (GP) works on variable-length representations. A naturally occurring problem is bloat, the unnecessary growth of solutions, which slows down optimization. Theoretical analyses have so far been unable to bound bloat and have instead required explicit assumptions on its magnitude. In this paper we analyze bloat in mutation-based genetic programming for the two test functions ORDER and MAJORITY. We overcome previous assumptions on the magnitude of bloat and give matching or close-to-matching upper and lower bounds for the expected optimization time. In particular, we show that the (1+1) GP takes (i) Θ(T_init + n log n) iterations with bloat control on ORDER as well as MAJORITY; and (ii) O(T_init log T_init + n (log n)^3) and Ω(T_init + n log n) (and Ω(T_init log T_init) for n = 1) iterations without bloat control on MAJORITY.
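
For readers unfamiliar with the two test functions, the following minimal sketch shows how ORDER and MAJORITY are commonly defined in the GP theory literature. It is an illustration under stated assumptions, not the paper's own code: a GP tree is assumed to be flattened to its in-order sequence of leaves, a positive integer i stands for the literal x_i and -i for its negation, and the function names `order` and `majority` are hypothetical.

```python
from collections import Counter

def order(leaves, n):
    """ORDER (assumed definition): x_i counts as expressed if its first
    occurrence in the leaf sequence is the positive literal."""
    seen = set()
    expressed = 0
    for lit in leaves:
        var = abs(lit)
        if 1 <= var <= n and var not in seen:
            seen.add(var)          # only the first occurrence of a variable matters
            if lit > 0:
                expressed += 1
    return expressed

def majority(leaves, n):
    """MAJORITY (assumed definition): x_i counts as expressed if it occurs at
    least once and at least as often as its negation."""
    counts = Counter(leaves)
    return sum(
        1 for i in range(1, n + 1)
        if counts[i] >= 1 and counts[i] >= counts[-i]
    )

# Hypothetical leaf sequence: x1, not-x1, not-x2, x2, x3
leaves = [1, -1, -2, 2, 3]
print(order(leaves, 3), majority(leaves, 3))
```

Under these definitions, extra leaves such as a repeated x1 change the size of a solution without changing its fitness, which is the kind of growth the abstract calls bloat.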
