Efficient sensorimotor learning from multiple demonstrations

Abstract In this paper, we present a new approach to the problem of learning motor primitives that combines ideas from statistical generalization and error learning. The learning procedure is formulated in two stages. The first stage generalizes previously trained movements associated with specific task configurations, yielding a first approximation of a suitable control policy in the new situation. The second stage applies learning in the subspace defined by the previously acquired training data, which turns the task into a learning problem in a constrained domain. We show that reinforcement learning in this constrained domain can be interpreted as an error-learning algorithm, and we propose modifications to speed up the learning process. The proposed approach was tested both in simulation and experimentally on two challenging tasks: matchbox flip-up and pouring.
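A minimal sketch of how the two-stage procedure could fit together, assuming numpy, scalar task configurations, and DMP-style weight vectors as the policy parameters. The function names (`generalize`, `constrained_policy_search`), the Gaussian-kernel regression for stage one, the PoWER-style reward-weighted update for stage two, and the toy reward are all illustrative assumptions, not the paper's implementation:

```python
import numpy as np

# Hypothetical setup: demonstration i provides a DMP weight vector
# W[i] (dimension D) recorded at task configuration Q[i].
rng = np.random.default_rng(0)
D, n_demos = 20, 5
Q = np.linspace(0.0, 1.0, n_demos)       # task configurations
W = rng.normal(size=(n_demos, D))        # demonstrated DMP weights

def generalize(q, Q, W, sigma=0.3):
    """Stage 1: statistical generalization -- here a Gaussian-kernel
    weighted average of the demonstrated weights at the query q."""
    k = np.exp(-0.5 * ((Q - q) / sigma) ** 2)
    k /= k.sum()
    return k @ W                         # initial policy parameters w0

def constrained_policy_search(w0, W, reward,
                              n_iters=50, n_rollouts=10, std=0.1):
    """Stage 2: episodic policy search with exploration restricted to
    the subspace spanned by the demonstrations (via SVD of W)."""
    # Basis of the demonstration subspace (rank <= n_demos - 1).
    U = np.linalg.svd(W - W.mean(axis=0), full_matrices=False)[2]
    w = w0.copy()
    for _ in range(n_iters):
        # Perturb only along the subspace directions.
        eps = rng.normal(scale=std, size=(n_rollouts, U.shape[0]))
        candidates = w + eps @ U
        r = np.array([reward(c) for c in candidates])
        # Reward-weighted averaging of the perturbations
        # (a PoWER-style update, used here as an assumed stand-in).
        weights = np.exp(r - r.max())
        w = w + (weights @ (eps @ U)) / weights.sum()
    return w

# Toy reward: negative distance to a target that lies inside the
# demonstration subspace (stands in for task performance feedback).
target = W.mean(axis=0)
reward = lambda w: -np.linalg.norm(w - target)

w0 = generalize(q=0.4, Q=Q, W=W)
w_final = constrained_policy_search(w0, W, reward)
print(f"initial cost {-reward(w0):.3f} -> final cost {-reward(w_final):.3f}")
```

The key design point the sketch illustrates is that restricting exploration to the demonstration subspace reduces the search dimensionality from the number of policy parameters to roughly the number of demonstrations, which is why learning in the constrained domain can be much faster than unconstrained policy search.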
