Robust Robot Learning from Demonstration and Skill Repair Using Conceptual Constraints

Learning from demonstration (LfD) has enabled robots to rapidly gain new skills and capabilities by leveraging examples provided by novice human operators. While effective, this training mechanism leaves learned skills vulnerable to sub-optimal demonstrations, in which unintentional operator error degrades performance. In this work, we introduce Concept Constrained Learning from Demonstration (CC-LfD), a novel algorithm for robust skill learning and skill repair that incorporates annotations of conceptually grounded constraints (in the form of planning predicates), provided during live demonstrations, into the LfD process. Through our evaluation, we show that CC-LfD can quickly repair skills from as little as a single annotated demonstration, without the need to identify and remove low-quality demonstrations. We also provide evidence for potential applications to transfer learning, whereby constraints can be used to adapt demonstrations from a related task and achieve proficiency with few new demonstrations.
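
To make the constraint-annotation idea concrete, below is a minimal Python sketch of one plausible representation, assuming a keyframe-based demonstration encoding: each keyframe carries the names of the planning-predicate constraints that should hold at that point, and keyframes that violate their annotated constraints are pruned before a skill model is fit. The names (`Constraint`, `Keyframe`, `prune_invalid_keyframes`) and the example "upright" predicate are illustrative assumptions, not the paper's implementation.

```python
# Sketch of constraint-annotated demonstrations for CC-LfD-style learning.
# All names and the example predicate are illustrative assumptions,
# not the authors' actual API or code.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, float]  # e.g. end-effector pose features


@dataclass
class Constraint:
    """A conceptually grounded constraint expressed as a boolean predicate."""
    name: str
    predicate: Callable[[State], bool]


@dataclass
class Keyframe:
    """One sampled point of a demonstration, annotated with active constraints."""
    state: State
    active_constraints: List[str] = field(default_factory=list)


def satisfies(kf: Keyframe, constraints: Dict[str, Constraint]) -> bool:
    """A keyframe is valid if every constraint annotated as active holds."""
    return all(constraints[c].predicate(kf.state) for c in kf.active_constraints)


def prune_invalid_keyframes(demo: List[Keyframe],
                            constraints: Dict[str, Constraint]) -> List[Keyframe]:
    """Drop keyframes that violate their annotated constraints, so a downstream
    skill model is fit only to constraint-consistent data."""
    return [kf for kf in demo if satisfies(kf, constraints)]


if __name__ == "__main__":
    # Example constraint: keep a held cup upright (small roll angle).
    constraints = {
        "upright": Constraint("upright", lambda s: abs(s["roll"]) < 0.2),
    }
    demo = [
        Keyframe({"x": 0.4, "roll": 0.05}, ["upright"]),
        Keyframe({"x": 0.5, "roll": 0.90}, ["upright"]),  # violates "upright"
        Keyframe({"x": 0.6, "roll": 0.10}, ["upright"]),
    ]
    cleaned = prune_invalid_keyframes(demo, constraints)
    print(f"kept {len(cleaned)} of {len(demo)} keyframes")  # kept 2 of 3
```

In a fuller pipeline, the surviving keyframes would presumably be passed to whatever trajectory or keyframe model the LfD system uses, and a skill could be repaired by adding a single new annotated demonstration and refitting, without hunting down the original low-quality demonstrations.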
