Recursion-Based Biases in Stochastic Grammar Model Genetic Programming

Estimation of distribution algorithms (EDAs) applied to genetic programming (GP) have been studied by a number of authors. Like all EDAs, they suffer from biases induced by the model building and sampling process, but these biases are amplified in GP. In particular, many systems use stochastic grammars as their model representation, and grammar recursion gives rise to bias. We define and estimate the bias due to recursion in grammar-based EDAs for GP, using methods derived from computational linguistics, and confirm its extent in some simple experimental examples. We then propose methods to repair this bias, and apply both the estimation and the repair to more practical applications. The experiments demonstrate the extent of the bias arising from recursion and the performance improvements that can result from correcting it.
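
As a rough illustration of how recursion in a stochastic grammar shapes what gets sampled, the sketch below uses a toy grammar that is an assumption for this example and is not taken from the paper: S -> S S with probability p, and S -> a with probability 1 - p. The function names and the choice of grammar are hypothetical. Standard branching-process reasoning gives the probability that a derivation terminates as the smallest solution of q = (1 - p) + p q^2, and the expected number of terminals as (1 - p) / (1 - 2p) when p < 0.5. This is not the bias measure defined by the authors, only a minimal view of why recursive rule probabilities matter.

```python
def termination_probability(p_recurse, iterations=10_000):
    """Probability that sampling from S -> S S (p) | a (1 - p) terminates.

    Iterating q = (1 - p) + p * q**2 from q = 0 converges to the
    smallest fixed point, which is the termination probability.
    Convergence is slow near p = 0.5.
    """
    q = 0.0
    for _ in range(iterations):
        q = (1.0 - p_recurse) + p_recurse * q * q
    return q


def expected_leaves(p_recurse):
    """Expected number of terminals in a sampled tree.

    Solves L = (1 - p) + 2 * p * L, which is finite only for p < 0.5.
    """
    if p_recurse >= 0.5:
        return float("inf")
    return (1.0 - p_recurse) / (1.0 - 2.0 * p_recurse)


if __name__ == "__main__":
    # Hypothetical rule probabilities, chosen only for illustration.
    for p in (0.25, 0.4, 0.75):
        print(p, termination_probability(p), expected_leaves(p))
```

In this toy setting, a small shift in p near 0.5 changes the expected tree size sharply, and above 0.5 a positive share of the probability mass goes to derivations that never terminate.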
