Characterizing the Effects of Random Subsampling on Lexicase Selection

Lexicase selection is a proven parent-selection algorithm designed for genetic programming problems, especially for uncompromising test-based problems where many distinct test cases must all be passed. Previous work has shown that random subsampling techniques can improve lexicase selection’s problem-solving success; here, we investigate why. We test two lexicase variants that use random subsampling: down-sampled lexicase, which uses a random subset of all training cases each generation; and cohort lexicase, which collects candidate solutions and training cases into small groups for testing, reshuffling those groups each generation. We show that both variants improve problem-solving success by facilitating deeper evolutionary searches; that is, given a fixed number of test-case evaluations, they allow populations to evolve for more generations than standard lexicase. We also demonstrate that the subsampled variants require less computational effort to find solutions, even though subsampling hinders lexicase’s ability to preserve specialists. Contrary to our expectations, we found no evidence that subsampling systematically degrades the maintenance of phenotypic diversity; we did, however, find evidence that cohort lexicase is significantly better at preserving phylogenetic diversity than down-sampled lexicase.
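The two subsampling schemes described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the `errors[i][case]` matrix layout, and the budget figures in the usage note are all assumptions made for the example. For simplicity the sketch reads from a fully precomputed error matrix, whereas a real run would evaluate each individual only on the cases it is actually tested against, which is where the savings come from.

```python
import random


def lexicase_select(candidates, cases, errors, rng):
    """Pick one parent by filtering candidates on cases in random order."""
    pool = list(candidates)
    order = list(cases)
    rng.shuffle(order)
    for case in order:
        # Keep only the candidates with the lowest error on this case.
        best = min(errors[i][case] for i in pool)
        pool = [i for i in pool if errors[i][case] == best]
        if len(pool) == 1:
            break
    return rng.choice(pool)  # break any remaining tie at random


def down_sampled_parents(errors, num_cases, sample_rate, rng):
    """Down-sampled lexicase: one random subset of cases per generation."""
    pop = range(len(errors))
    k = max(1, int(num_cases * sample_rate))
    sample = rng.sample(range(num_cases), k)
    return [lexicase_select(pop, sample, errors, rng) for _ in pop]


def cohort_parents(errors, num_cases, num_cohorts, rng):
    """Cohort lexicase: shuffle individuals and cases into paired groups."""
    inds = list(range(len(errors)))
    cases = list(range(num_cases))
    rng.shuffle(inds)
    rng.shuffle(cases)
    parents = []
    for c in range(num_cohorts):
        ind_group = inds[c::num_cohorts]    # candidate cohort c
        case_group = cases[c::num_cohorts]  # test-case cohort c
        # Each candidate cohort competes only on its paired case cohort.
        parents += [lexicase_select(ind_group, case_group, errors, rng)
                    for _ in ind_group]
    return parents


def generations_for_budget(budget, pop_size, cases_per_individual):
    """Generations affordable under a fixed test-case evaluation budget."""
    return budget // (pop_size * cases_per_individual)
```

As a purely hypothetical illustration of the "deeper search" argument: with a budget of 1,000,000 evaluations and 100 individuals, `generations_for_budget(1_000_000, 100, 100)` gives 100 generations when every individual sees all 100 cases, while `generations_for_budget(1_000_000, 100, 10)` gives 1000 generations when each individual sees only 10.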
