Robot skill learning in latent space of a deep autoencoder neural network

Abstract: Just like humans, robots can improve their performance by practicing, i.e., by performing the desired behavior many times and updating the underlying skill representation with the newly gathered data. In this paper, we propose to implement robot practicing by applying statistical learning and reinforcement learning (RL) in a latent space of the selected skill representation. The latent space is computed by a deep autoencoder neural network, and the data used to train the network are generated in simulation. Nevertheless, we show that the resulting latent space representation is also useful for learning on a real robot. Our simulation and real-world results demonstrate that exploiting the latent space of the underlying motor skill representation significantly reduces the amount of data needed for effective learning by Gaussian process regression (GPR). Similarly, the number of RL epochs can be significantly reduced. Finally, our results show that an autoencoder-based latent space is more effective for these purposes than a latent space computed by principal component analysis (PCA).
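
The pipeline described in the abstract (encode skill parameters into a latent space, learn in that space, decode back) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes NumPy, uses a linear PCA encoder and decoder as a stand-in for the deep autoencoder, and uses synthetic data. The names `skill_params`, `mixing`, `encode`, `decode` and `gpr_predict` are hypothetical, as are the dimensions and kernel settings. Because the stand-in encoder is linear, the sketch only shows the shape of the pipeline and does not reproduce the data-efficiency gains reported in the paper.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel between two sets of row vectors."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)


def gpr_predict(q_train, y_train, q_test, noise=1e-6):
    """GPR posterior mean with a zero prior mean, one GP per output column."""
    k_train = rbf_kernel(q_train, q_train) + noise * np.eye(len(q_train))
    k_test = rbf_kernel(q_test, q_train)
    return k_test @ np.linalg.solve(k_train, y_train)


rng = np.random.default_rng(0)

# Assumption: skills are 40-dimensional parameter vectors that lie on a
# 2-dimensional linear subspace and depend on a scalar task query q.
n_params, n_latent = 40, 2
mixing = rng.normal(size=(n_latent, n_params))


def skill_params(q):
    """Synthetic 'simulator': maps queries of shape (n, 1) to skill parameters."""
    latent = np.hstack([np.sin(2.0 * q), np.cos(3.0 * q)])
    return latent @ mixing


# Step 1: generate plenty of data in "simulation" and fit the latent space.
q_sim = np.linspace(0.0, 1.0, 200)[:, None]
w_sim = skill_params(q_sim)
mean = w_sim.mean(axis=0)
_, _, vt = np.linalg.svd(w_sim - mean, full_matrices=False)
components = vt[:n_latent]  # PCA basis, standing in for the autoencoder


def encode(w):
    """Project skill parameters into the latent space."""
    return (w - mean) @ components.T


def decode(z):
    """Map latent codes back to skill parameters."""
    return z @ components + mean


# Step 2: learn query -> latent code from only a few "real" samples.
q_train = np.linspace(0.0, 1.0, 8)[:, None]
w_train = skill_params(q_train)
z_train = encode(w_train)

# Step 3: predict latent codes for new queries and decode them.
q_test = np.linspace(0.05, 0.95, 25)[:, None]
w_pred = decode(gpr_predict(q_train, z_train, q_test))

rmse = float(np.sqrt(np.mean((w_pred - skill_params(q_test)) ** 2)))
print(f"latent dimension: {z_train.shape[1]}, test RMSE: {rmse:.2e}")
```

In this sketch the regression targets have 2 dimensions instead of 40, which is the kind of reduction the abstract exploits. Replacing `encode` and `decode` with the two halves of a trained nonlinear autoencoder would keep the rest of the pipeline unchanged.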
