Robot learning from demonstrations: Emulation learning in environments with moving obstacles

Abstract In this paper, we present an approach to Robot Learning from Demonstration (RLfD) in a dynamic environment, i.e. an environment whose state changes throughout the course of performing a task. RLfD has mostly been exploited successfully in non-varying environments, such as fixed manufacturing workspaces, to reduce programming time and cost. Non-conventional production lines, however, necessitate Human–Robot Collaboration (HRC), implying that robots and humans must work in shared workspaces. Under such conditions, the robot needs to avoid colliding with objects that humans move around the workspace. Therefore, the robot is required (i) to learn a task model from demonstrations, (ii) to learn a control policy for avoiding a stationary obstacle, and (iii) to build a control policy from demonstration for avoiding moving obstacles. Here, we present an incremental approach to RLfD that addresses all three of these problems. We demonstrate the effectiveness of the proposed RLfD approach through a series of pick-and-place experiments with an ABB YuMi robot. The experimental results show that a person can work in a workspace shared with the robot, which successfully avoids colliding with them.
