Robot motor skill coordination with EM-based Reinforcement Learning

We present an approach allowing a robot to acquire new motor skills by learning the couplings across motor control variables. The demonstrated skill is first encoded in a compact form through a modified version of Dynamic Movement Primitives (DMP) which encapsulates correlation information. Expectation-Maximization based Reinforcement Learning is then used to modulate the mixture of dynamical systems initialized from the user's demonstration. The approach is evaluated on a torque-controlled 7 DOFs Barrett WAM robotic arm. Two skill learning experiments are conducted: a reaching task where the robot needs to adapt the learned movement to avoid an obstacle, and a dynamic pancake-flipping task.

[1]  T. Flash,et al.  The coordination of arm movements: an experimentally confirmed mathematical model , 1985, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[2]  Jun Nakanishi,et al.  Trajectory formation for imitation with nonlinear dynamical systems , 2001, Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the the Next Millennium (Cat. No.01CH37180).

[3]  Michael I. Jordan,et al.  Optimal feedback control as a theory of motor coordination , 2002, Nature Neuroscience.

[4]  Ronald J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[5]  Andreas Daffertshofer,et al.  The evolution of coordination during skill acquisition: The dynamical systems approach , 2004 .

[6]  Stefan Schaal,et al.  Natural Actor-Critic , 2003, Neurocomputing.

[7]  Michael T. Rosenstein,et al.  Learning at the level of synergies for a robot weightlifter , 2006, Robotics Auton. Syst..

[8]  Stefan Schaal,et al.  Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[9]  Aude Billard,et al.  Reinforcement learning for imitating constrained reaching movements , 2007, Adv. Robotics.

[10]  Stefan Schaal,et al.  Dynamics systems vs. optimal control--a unifying view. , 2007, Progress in brain research.

[11]  Cecilio Angulo,et al.  Collaborative control in a humanoid dynamic task , 2007, ICINCO-RA.

[12]  Stefan Schaal,et al.  Robot Programming by Demonstration , 2009, Springer Handbook of Robotics.

[13]  Darwin G. Caldwell,et al.  Handling of multiple constraints and motion alternatives in a robot programming by demonstration framework , 2009, 2009 9th IEEE-RAS International Conference on Humanoid Robots.

[14]  Jan Peters,et al.  Learning motor primitives for robotics , 2009, 2009 IEEE International Conference on Robotics and Automation.

[15]  B. Schölkopf,et al.  Reinforcement Learning for Motor Primitives , 2009 .

[16]  Brett Browning,et al.  A survey of robot learning from demonstration , 2009, Robotics Auton. Syst..

[17]  Henk Nijmeijer,et al.  Robot Programming by Demonstration , 2010, SIMPAR.

[18]  Pieter Abbeel,et al.  Apprenticeship learning for helicopter control , 2009, CACM.

[19]  Stefan Schaal,et al.  Biologically-inspired dynamical systems for movement generation: Automatic real-time goal adaptation and obstacle avoidance , 2009, 2009 IEEE International Conference on Robotics and Automation.

[20]  Diego Esteban Pardo Ayala Learning rest-to-rest motor coordination in articulated mobile robots , 2009 .

[21]  Anthony Jarc,et al.  Simplified and effective motor control based on muscle synergies to exploit musculoskeletal dynamics , 2009, Proceedings of the National Academy of Sciences.

[22]  Stefan Schaal,et al.  Reinforcement learning of motor skills in high dimensions: A path integral approach , 2010, 2010 IEEE International Conference on Robotics and Automation.

[23]  Aude Billard,et al.  BM: An iterative algorithm to learn stable non-linear dynamical systems with Gaussian mixture models , 2010, 2010 IEEE International Conference on Robotics and Automation.

[24]  Jan Peters,et al.  Noname manuscript No. (will be inserted by the editor) Policy Search for Motor Primitives in Robotics , 2022 .

[25]  Darwin G. Caldwell,et al.  Learning-based control strategy for safe human-robot interaction exploiting task and robot redundancies , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[26]  Darwin G. Caldwell,et al.  Learning and Reproduction of Gestures by Imitation , 2010, IEEE Robotics & Automation Magazine.

[27]  Eric L. Sauser,et al.  An Approach Based on Hidden Markov Model and Gaussian Mixture Regression , 2010 .