An Analysis of the Causes of Code Growth in Genetic Programming

This research examines the causes of code growth (bloat) in genetic programming (GP). Three causes of code growth in GP are currently hypothesized: protection, drift, and removal bias. We show that single-node mutations increase code growth in evolving programs, which is strong evidence that the protection hypothesis is correct. We also show a negative correlation between the size of the branch removed during crossover and the resulting change in fitness, but a much weaker correlation for added branches. These results support the removal bias hypothesis but appear to refute the drift hypothesis. Our results also suggest that the tree-structured programs commonly evolved with GP have serious disadvantages, because the nodes near the root are effectively fixed in the very early generations.
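To make the two operators in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes trees are nested Python lists of the form [label, child, ...]. The primitive sets and the names FUNCTIONS, TERMINALS, point_mutate, and crossover are hypothetical and chosen only for illustration. The sketch shows that a single-node mutation relabels one node without changing tree size, and that subtree crossover can report the size of the removed and added branches, which are the quantities the abstract correlates with the change in fitness.

```python
import copy
import random

# Hypothetical primitive sets (assumption: the paper's actual sets are not given here).
FUNCTIONS = {"+": 2, "-": 2, "*": 2, "neg": 1}  # label -> arity
TERMINALS = ["x", "y", "1"]


def size(tree):
    """Count the nodes of a tree stored as [label, child, ...]."""
    return 1 + sum(size(child) for child in tree[1:])


def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node, root first."""
    yield prefix
    for i, child in enumerate(tree[1:], start=1):
        yield from paths(child, prefix + (i,))


def get(tree, path):
    """Return the subtree reached by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by a copy of new."""
    new = copy.deepcopy(new)
    if not path:
        return new
    out = copy.deepcopy(tree)
    get(out, path[:-1])[path[-1]] = new
    return out


def point_mutate(tree, rng):
    """Single-node mutation: relabel one node with a symbol of the same arity."""
    out = copy.deepcopy(tree)
    node = get(out, rng.choice(list(paths(out))))
    arity = len(node) - 1
    if arity == 0:
        node[0] = rng.choice(TERMINALS)
    else:
        node[0] = rng.choice([f for f, a in FUNCTIONS.items() if a == arity])
    return out


def crossover(parent, donor, rng):
    """Subtree crossover; also returns the sizes of the removed and added branches."""
    cut = rng.choice(list(paths(parent)))
    graft = get(donor, rng.choice(list(paths(donor))))
    removed, added = size(get(parent, cut)), size(graft)
    return replace(parent, cut, graft), removed, added
```

In an experiment of the kind described, each (removed, added) pair returned by crossover would be logged alongside the child's change in fitness, so that the two correlations can be computed separately. The fitness function is problem-specific and is deliberately left out of this sketch.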
