Benchmarking Quality-Diversity Algorithms on Neuroevolution for Reinforcement Learning

We present a Quality-Diversity benchmark suite for Deep Neuroevolution in Reinforcement Learning domains for robot control. The suite includes the definition of tasks, environments, behavioral descriptors, and fitness functions. We specify different benchmarks based on the complexity of both the task and the agent, which is controlled by a deep neural network. The benchmark uses standard Quality-Diversity metrics, including coverage, QD-score, maximum fitness, and an archive profile metric that quantifies the relation between coverage and fitness. We also show how to quantify the robustness of the solutions with respect to environmental stochasticity by introducing corrected versions of the same metrics. We believe that our benchmark is a valuable tool for the community to compare and improve their findings. The source code is available online.
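For concreteness, the standard metrics listed above can be computed directly from a filled archive. The sketch below is a minimal illustration and not the benchmark's actual implementation: it assumes the archive is a plain dictionary mapping discretized behavioral-descriptor cells to the best fitness found in each cell, and all names (qd_metrics, num_cells, etc.) are hypothetical.

```python
# Minimal sketch of the standard QD metrics, assuming an archive represented
# as a dict that maps each discretized behavioral-descriptor cell to the
# fitness of the elite stored in that cell. Names are illustrative only.
from typing import Dict, Hashable


def qd_metrics(archive: Dict[Hashable, float], num_cells: int) -> Dict[str, float]:
    """Compute coverage, QD-score, and maximum fitness for a filled archive.

    archive   -- maps a descriptor cell (e.g. a tuple of bin indices) to the
                 fitness of the elite stored in that cell
    num_cells -- total number of cells in the descriptor grid
    """
    if not archive:
        return {"coverage": 0.0, "qd_score": 0.0, "max_fitness": float("-inf")}

    fitnesses = list(archive.values())
    return {
        # Fraction of the descriptor grid that contains at least one elite.
        "coverage": len(archive) / num_cells,
        # Sum of elite fitnesses; assumes fitness is shifted to be
        # non-negative so that filling a new cell never lowers the score.
        "qd_score": sum(fitnesses),
        # Best fitness found anywhere in the archive.
        "max_fitness": max(fitnesses),
    }


# Example usage on a toy 10x10 descriptor grid:
toy_archive = {(0, 1): 0.8, (3, 7): 1.2, (9, 9): 0.4}
print(qd_metrics(toy_archive, num_cells=100))
```

In this sketch, the corrected variants mentioned above would amount to re-evaluating each elite over several stochastic episodes, rebuilding the archive from the re-evaluated fitnesses and descriptors, and then applying the same computation.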
