Self-Evaluation in One-Shot Learning from Demonstration of Contact-Intensive Tasks

Humans naturally "program" a fellow collaborator to perform a task by demonstrating it a few times. It is intuitive, therefore, for a human to program a collaborative robot by demonstration, and many paradigms use a single demonstration of the task. This is a form of one-shot learning in which a single training example, plus some context of the task, is used to infer a model of the task for subsequent execution and later refinement. This paper presents a one-shot learning-from-demonstration framework for contact-intensive tasks that uses only visual perception of the demonstrated task. The robot learns a policy for performing the task in terms of a priori skills and then uses self-evaluation, based on visual and tactile perception of its own skill performance, to learn the force correspondences for those skills. Self-evaluation is performed against goal states detected in the demonstration with the help of task context, and the skill parameters are tuned using reinforcement learning. This approach enables the robot to learn force correspondences that cannot be inferred from a visual demonstration of the task. The effectiveness of the approach is evaluated on a vegetable peeling task.
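
The abstract does not specify the reinforcement-learning formulation. As a minimal illustrative sketch only, tuning a single skill parameter (here, an applied peeling force) can be framed as tabular Q-learning over discretized force levels, with the robot's self-evaluation score acting as the reward. The function names, force values, and reward model below are assumptions for illustration, not the paper's implementation.

```python
import random

FORCE_LEVELS = [2.0, 4.0, 6.0, 8.0, 10.0]  # candidate peeling forces (N), assumed

def self_evaluation_reward(force, target=6.0):
    """Stand-in for visual/tactile self-evaluation: reward is highest when
    the applied force matches the (unknown to the learner) effective force."""
    return 1.0 - abs(force - target) / 10.0

def tune_force(episodes=2000, alpha=0.1, epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over a single state (bandit form)."""
    rng = random.Random(seed)
    q = [0.0] * len(FORCE_LEVELS)  # one Q-value per candidate force
    for _ in range(episodes):
        if rng.random() < epsilon:   # explore a random force level
            a = rng.randrange(len(FORCE_LEVELS))
        else:                        # exploit the current best estimate
            a = max(range(len(FORCE_LEVELS)), key=lambda i: q[i])
        r = self_evaluation_reward(FORCE_LEVELS[a])
        q[a] += alpha * (r - q[a])   # incremental Q update toward observed reward
    return FORCE_LEVELS[max(range(len(FORCE_LEVELS)), key=lambda i: q[i])]

best_force = tune_force()
print(best_force)  # converges toward the force level with the highest reward
```

In the actual system the reward would come from comparing the observed skill outcome against the goal states detected in the demonstration, rather than from a closed-form function as in this toy model.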
