Autonomous skill discovery with quality-diversity and unsupervised descriptors

Quality-Diversity optimization is a new family of optimization algorithms that, instead of searching for a single optimal solution to a task, searches for a large collection of solutions that each solve the task in a different way. This approach is particularly promising for learning behavioral repertoires in robotics, as such a diversity of behaviors enables robots to be more versatile and resilient. However, these algorithms require the user to manually define a behavioral descriptor, which is used to determine whether two solutions are different or similar. The choice of a behavioral descriptor is crucial, as it completely changes the types of solutions that the algorithm produces. In this paper, we introduce a new method to automatically define this descriptor by combining Quality-Diversity algorithms with unsupervised dimensionality reduction algorithms. This approach enables robots to autonomously discover the range of their capabilities while interacting with their environment. The results from two experimental scenarios demonstrate that robots can autonomously discover a large range of possible behaviors, without any prior knowledge about their morphology and environment. Furthermore, these behaviors are deemed similar to hand-crafted solutions that use domain knowledge, and significantly more diverse than those obtained with existing unsupervised methods.
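
The combination described above can be illustrated with a minimal sketch. It is not the paper's implementation: PCA stands in for the unsupervised dimensionality reduction algorithm, the task in `evaluate` is a hypothetical toy problem, and the archive rule (add a solution if its descriptor is far enough from all others, otherwise replace the nearest one if the new solution is fitter) as well as every name and parameter value are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def evaluate(genotype):
    """Hypothetical toy task: a 2-parameter controller yields a 50-step
    trajectory (the raw sensory data) and a fitness score."""
    t = np.linspace(0.0, 1.0, 50)
    observation = genotype[0] * np.sin(2 * np.pi * t) + genotype[1] * t
    fitness = -float(np.sum(genotype ** 2))  # prefer low-effort controllers
    return observation, fitness

def fit_pca(observations, n_components=2):
    """Fit a PCA encoder (assumed stand-in for any unsupervised reducer)."""
    mean = observations.mean(axis=0)
    _, _, vt = np.linalg.svd(observations - mean, full_matrices=False)
    return mean, vt[:n_components]

def encode(model, observation):
    """Project raw sensory data onto the learned low-dimensional descriptor."""
    mean, components = model
    return components @ (observation - mean)

def try_insert(archive, entry, min_dist):
    """Add the entry if it is novel; otherwise replace the nearest if fitter."""
    if not archive:
        archive.append(entry)
        return
    descs = np.stack([e["desc"] for e in archive])
    dists = np.linalg.norm(descs - entry["desc"], axis=1)
    nearest = int(np.argmin(dists))
    if dists[nearest] > min_dist:
        archive.append(entry)
    elif entry["fitness"] > archive[nearest]["fitness"]:
        archive[nearest] = entry

def make_entry(model, genotype):
    obs, fit = evaluate(genotype)
    return {"geno": genotype, "obs": obs, "fitness": fit,
            "desc": encode(model, obs)}

def run(iterations=1000, retrain_every=250, min_dist=0.5):
    # Bootstrap: random solutions provide the first training set.
    genotypes = rng.uniform(-1.0, 1.0, size=(20, 2))
    model = fit_pca(np.stack([evaluate(g)[0] for g in genotypes]))
    archive = []
    for g in genotypes:
        try_insert(archive, make_entry(model, g), min_dist)

    for it in range(1, iterations + 1):
        # Quality-Diversity step: mutate a random archive member.
        parent = archive[rng.integers(len(archive))]
        child = np.clip(parent["geno"] + rng.normal(0.0, 0.1, size=2), -1.0, 1.0)
        try_insert(archive, make_entry(model, child), min_dist)

        # Periodically retrain the encoder on the archive's sensory data,
        # then recompute every descriptor and rebuild the archive.
        if it % retrain_every == 0:
            model = fit_pca(np.stack([e["obs"] for e in archive]))
            old, archive = archive, []
            for e in old:
                e["desc"] = encode(model, e["obs"])
                try_insert(archive, e, min_dist)
    return archive
```

Calling `run()` returns a list of entries, each holding a genotype, its raw observation, its fitness, and its learned 2-D descriptor. The retraining step is the key design choice in this sketch: because the descriptor space changes whenever the encoder is refitted, all stored solutions are re-encoded so that distances in the archive stay comparable.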
