The Problem Solving Benefits of Down-sampling Vary by Selection Scheme

Genetic programming (GP) systems often use large training sets to evaluate candidate solutions, which can be computationally expensive. Down-sampling training sets has long been used to decrease the computational cost of evaluation in a wide range of application domains. Recent studies have also shown that both random and informed down-sampling can substantially improve problem-solving success for GP systems that use lexicase parent selection. We use the PushGP framework to experimentally test whether these down-sampling techniques can also improve problem-solving success in the context of two other commonly used selection methods, fitness-proportionate and tournament selection, across eight GP problems (four program synthesis and four symbolic regression). We found that down-sampling can benefit the problem-solving success of both fitness-proportionate and tournament selection. However, the number of problems on which down-sampling improved problem-solving success varied by selection scheme, suggesting that the impact of down-sampling depends on both the problem and the choice of selection scheme. Surprisingly, we found that down-sampling was more consistently beneficial when combined with lexicase selection than when combined with tournament or fitness-proportionate selection. Overall, our results suggest that down-sampling should be considered more often when solving test-based GP problems.
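To make the setup concrete, the following is a minimal sketch of random down-sampling combined with tournament selection. It is an illustration under stated assumptions, not the PushGP implementation used in the study: candidate programs are modeled as plain Python callables, fitness is total absolute error, and the names `random_down_sample`, `total_error`, and `tournament_select` are hypothetical. Informed down-sampling is not shown.

```python
import random


def random_down_sample(training_cases, rate, rng):
    """Return a random subset of the training cases (at least one case)."""
    k = max(1, round(rate * len(training_cases)))
    return rng.sample(training_cases, k)


def total_error(program, cases):
    """Sum of absolute errors of a candidate program on the given cases."""
    return sum(abs(program(x) - y) for x, y in cases)


def tournament_select(population, cases, tournament_size, rng):
    """Pick the lowest-error program among randomly drawn contestants."""
    contestants = rng.sample(population, tournament_size)
    return min(contestants, key=lambda p: total_error(p, cases))


# Hypothetical toy problem: the target function is y = 2x.
rng = random.Random(0)
training_cases = [(x, 2 * x) for x in range(8)]
population = [
    lambda x: x,        # wrong
    lambda x: 2 * x,    # exact solution
    lambda x: x + 3,    # wrong
    lambda x: 3 * x,    # wrong
]

# Each generation evaluates candidates on a fresh sample, not the full set.
sample = random_down_sample(training_cases, rate=0.25, rng=rng)
parent = tournament_select(population, sample, tournament_size=4, rng=rng)
```

With a down-sample rate of 0.25, each candidate is evaluated on 2 of the 8 training cases, so the same evaluation budget could cover more candidates or more generations. The trade-off is that a candidate's fitness is estimated from a subset of cases and may differ from its fitness on the full training set.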
