An Analysis of Selection in Genetic Programming

This thesis analyses the selection process in tree-based Genetic Programming (GP), covering the optimisation of both parent and offspring selection. It provides a detailed understanding of selection and guidance on how to improve GP search effectively and efficiently. The first part of the thesis provides models and visualisations for analysing selection behaviour in standard tournament selection, clarifies several issues in standard tournament selection, and presents a novel solution that automatically and dynamically optimises parent selection pressure. It then addresses the fitness evaluation cost of parent selection and introduces some cost-saving algorithms. It also analyses the feasibility of using good predecessor programs to increase parent selection efficiency. The second part of the thesis analyses the impact of offspring selection pressure on overall GP search performance. It then addresses the fitness evaluation cost of offspring selection by analysing the characteristics of good crossover events and investigating heuristics that locate good offspring efficiently by structurally constraining crossover point selection.
The main outcomes of the thesis are three new algorithms and four observations: 1) a clustering tournament selection method that automatically and dynamically tunes parent selection pressure; 2) a passive evaluation algorithm that reduces parent fitness evaluation cost for standard tournament selection under low selection pressure; 3) a heuristic population clustering algorithm that reduces parent fitness evaluation cost while taking advantage of clustering tournament selection and avoiding the tournament size limitation; 4) population size has little impact on parent selection pressure, so the tournament size configuration is independent of population size, and different sampling replacement strategies have little impact on selection behaviour in standard tournament selection; 5) premature convergence occurs more often when stochastic elements are removed from both the parent and the offspring selection processes; 6) good crossover events have a strong preference for whole program trees and, less strongly, for single-node or small subtrees at the bottom of parent program trees; 7) the ability of standard GP crossover to generate good offspring is far below what was expected.
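
As a rough illustration of observation 4, the following minimal Python sketch implements one common form of standard tournament selection (uniform sampling with replacement, fittest contestant wins) and computes the probability that a given individual is never sampled in a generation. The function names `tournament_select` and `prob_never_sampled`, the "higher fitness is better" convention, and the assumption of one tournament per population slot are illustrative choices, not taken from the thesis, whose own models may differ. Under these assumptions the not-sampled probability is (1 - 1/N)^(kN), which stays close to e^(-k) for any reasonably large population size N and so depends mainly on the tournament size k.

```python
import random


def tournament_select(fitnesses, tournament_size, rng):
    """Return the index of the winner of one standard tournament.

    Assumption: contestants are drawn uniformly with replacement and
    higher fitness is better.
    """
    contestants = [rng.randrange(len(fitnesses)) for _ in range(tournament_size)]
    return max(contestants, key=lambda i: fitnesses[i])


def prob_never_sampled(population_size, tournament_size):
    """Probability that one given individual is never drawn.

    Assumption: population_size tournaments are run per generation,
    each drawing tournament_size contestants with replacement.
    """
    draws = population_size * tournament_size
    return (1.0 - 1.0 / population_size) ** draws


if __name__ == "__main__":
    rng = random.Random(0)
    fitnesses = [rng.random() for _ in range(100)]
    parents = [tournament_select(fitnesses, 4, rng) for _ in fitnesses]
    print("distinct parents selected:", len(set(parents)))
    # The not-sampled probability barely changes as the population grows.
    for n in (50, 500, 5000):
        print(n, prob_never_sampled(n, 4))
```

Running the loop for several population sizes with a fixed tournament size gives values that differ only slightly, which is one simple way to see why tournament size can be configured without reference to population size.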
