Multiple Hands Make Light Work: Enhancing Quality and Diversity using MAP-Elites with Multiple Parallel Evolution Strategies

With the development of hardware accelerators and their corresponding software tools, fast and massively parallel evaluations have become affordable in many applications. This advance has drastically reduced the runtime of evaluation-heavy, evolution-inspired algorithms such as Quality-Diversity (QD) optimization, creating tremendous potential for algorithmic innovation through scale. In this work, we propose MAP-Elites-Multi-ES (MEMES), a novel QD algorithm based on Evolution Strategies (ES) and designed for fast parallel evaluations. MEMES builds on the existing MAP-Elites-ES algorithm, scaling it by maintaining multiple independent ES processes with massive parallelization. We also introduce a dynamic reset procedure that controls the lifespan of each independent ES process so as to autonomously maximize the improvement of the QD population. We show experimentally that, compared generation by generation, MEMES outperforms existing gradient-based and objective-agnostic QD algorithms. We perform this comparison on both black-box optimization and QD-Reinforcement-Learning tasks, demonstrating the benefit of our approach across problems and domains. Finally, we find that our approach intrinsically enables optimizing fitness locally around each niche, a phenomenon not observed in other QD algorithms.
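
To make the structure concrete, below is a minimal, self-contained JAX sketch of the MEMES loop under strong simplifying assumptions: a toy 10-dimensional sphere objective whose first two genotype dimensions serve as the behavior descriptor, plain OpenAI-style ES updates on fitness only (the paper also uses novelty-seeking ES processes), and insertion of only the best offspring of each ES process per generation. All names here (scoring_fn, es_step, try_insert, STALE_LIMIT, ...) are illustrative assumptions, not the authors' API; the point is the shape of the algorithm: several independent ES means advanced in lockstep with jax.vmap, sharing one archive, each reset from a random elite once it stops improving that archive.

import jax
import jax.numpy as jnp

NUM_ES = 8        # number of independent ES processes run in parallel
POP = 64          # offspring per ES per generation (antithetic pairs)
DIM = 10          # genotype dimension
GRID = 32         # archive resolution per descriptor dimension
SIGMA, LR = 0.02, 0.01
STALE_LIMIT = 10  # generations without archive improvement before a reset

def scoring_fn(x):
    # Toy problem: maximize a negative sphere; descriptor = first two genes.
    return -jnp.sum(x ** 2), jnp.clip(x[:2], -1.0, 1.0)

def to_cell(desc):
    # Map a descriptor in [-1, 1]^2 to a flat index in the GRID x GRID archive.
    ij = jnp.clip(((desc + 1.0) / 2.0 * GRID).astype(jnp.int32), 0, GRID - 1)
    return ij[0] * GRID + ij[1]

def es_step(mean, key):
    # One OpenAI-style ES update of a single search mean.
    eps = jax.random.normal(key, (POP // 2, DIM))
    eps = jnp.concatenate([eps, -eps])                          # mirrored sampling
    samples = mean + SIGMA * eps
    fits, descs = jax.vmap(scoring_fn)(samples)
    ranks = jnp.argsort(jnp.argsort(fits)) / (POP - 1) - 0.5    # rank-normalize
    grad = (ranks[:, None] * eps).mean(axis=0) / SIGMA
    return mean + LR * grad, samples, fits, descs

def try_insert(archive, genome, fit, desc):
    # Standard MAP-Elites insertion: keep the best genome per cell.
    cell = int(to_cell(desc))
    if fit > archive["fitness"][cell]:
        archive["fitness"] = archive["fitness"].at[cell].set(fit)
        archive["genome"] = archive["genome"].at[cell].set(genome)
        return True
    return False

key = jax.random.PRNGKey(0)
means = 0.5 * jax.random.normal(key, (NUM_ES, DIM))
archive = {"fitness": jnp.full(GRID * GRID, -jnp.inf),
           "genome": jnp.zeros((GRID * GRID, DIM))}
stale = [0] * NUM_ES

for _ in range(200):
    key, *subkeys = jax.random.split(key, NUM_ES + 1)
    # All ES processes advance one step in lockstep, vectorized with vmap.
    means, samples, fits, descs = jax.vmap(es_step)(means, jnp.stack(subkeys))
    for i in range(NUM_ES):
        best = int(jnp.argmax(fits[i]))
        improved = try_insert(archive, samples[i, best],
                              fits[i, best], descs[i, best])
        stale[i] = 0 if improved else stale[i] + 1
        if stale[i] > STALE_LIMIT:
            # Dynamic reset: restart this ES process from a random elite.
            filled = jnp.where(jnp.isfinite(archive["fitness"]))[0]
            key, sub = jax.random.split(key)
            means = means.at[i].set(archive["genome"][jax.random.choice(sub, filled)])
            stale[i] = 0

In the full method, each ES step consumes many more parallel evaluations and all offspring are offered to the archive; the per-process reset is what keeps a fixed budget of ES processes exploring new niches instead of stalling on converged ones.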
