Repeated Patterns in Tree Genetic Programming

We extend our analysis of repetitive patterns in genetic programming (GP) genomes to tree-based GP. As in linear GP, repetitive patterns are present in large numbers. Size fair crossover limits bloat in automatic programming, preventing the evolution of recurring motifs. We examine these complex properties in detail, for example using depth versus size Catalan binary tree shape plots, subgraph and subtree matching, information entropy, syntactic and semantic fitness correlations, and diffuse introns. We relate this emergent phenomenon to considerations about building blocks in GP and how GP works.
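To make two of the measures named above concrete, the following is a minimal sketch of exact subtree matching and of the Shannon entropy of node labels in a single program tree. It is an illustration only, not the procedure used in the paper. The nested-tuple tree encoding and the names `subtrees`, `size`, `repeated_subtrees`, and `label_entropy` are assumptions introduced here.

```python
from collections import Counter
from math import log2

# Assumed encoding: an internal node is a tuple (label, child, child, ...),
# and a leaf is a bare string such as "x".

def subtrees(tree):
    """Yield the tree itself and every subtree below it."""
    yield tree
    if isinstance(tree, tuple):
        for child in tree[1:]:
            yield from subtrees(child)

def size(tree):
    """Count the nodes in a tree (internal nodes plus leaves)."""
    if isinstance(tree, tuple):
        return 1 + sum(size(child) for child in tree[1:])
    return 1

def repeated_subtrees(tree, min_size=2):
    """Map each subtree that occurs more than once to its occurrence count.

    Subtrees smaller than min_size nodes are ignored, so single leaves
    are not reported as repeats by default.
    """
    counts = Counter(s for s in subtrees(tree) if size(s) >= min_size)
    return {s: n for s, n in counts.items() if n > 1}

def label_entropy(tree):
    """Shannon entropy, in bits, of the distribution of node labels."""
    labels = Counter(s[0] if isinstance(s, tuple) else s for s in subtrees(tree))
    total = sum(labels.values())
    return -sum((n / total) * log2(n / total) for n in labels.values())

# Hypothetical example tree: (x * x) + ((x * x) - y)
example = ("add", ("mul", "x", "x"), ("sub", ("mul", "x", "x"), "y"))
repeats = repeated_subtrees(example)
entropy = label_entropy(example)
```

Here `repeats` reports the subtree `("mul", "x", "x")` as occurring twice. Exact matching of this kind only finds identical subtrees; the subgraph matching mentioned in the abstract, which can match fragments that are not complete subtrees, would need a different approach.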
