Confidence-Based Demonstration Selection for Interactive Robot Learning

Effective learning-by-demonstration techniques allow complex robot behaviors to be taught from a small number of demonstrations. Demonstrations are obtained through a selection algorithm, which determines which states the human teacher labels. In this work, we examine selection algorithms used by the robot to request demonstration examples. Previous approaches typically rely on a single fixed confidence threshold. We highlight the drawbacks of using a single threshold and contribute an algorithm that automatically sets multiple confidence thresholds to target the domain states with the greatest uncertainty. Our evaluation compares the proposed multi-threshold selection method with confidence-based selection using a single fixed threshold and with manual data selection by the teacher. Our results indicate that the multi-threshold approach significantly reduces the number of demonstrations required to learn the task.
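To make the contrast between one threshold and several concrete, the following is a minimal Python sketch. It is not the algorithm contributed in this work, whose details are not given in the abstract. It assumes a hypothetical heuristic in which each predicted action gets its own threshold, set to the mean confidence of the training points that were misclassified as that action. The names `fit_thresholds` and `should_request_demo`, the sample format, and the default value of 0.5 are all illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def fit_thresholds(samples):
    """Return one confidence threshold per predicted action.

    `samples` is an iterable of (predicted_action, true_action, confidence).
    Assumed heuristic (not taken from the paper): the threshold for an action
    is the mean confidence of the samples misclassified as that action.
    """
    wrong = defaultdict(list)
    for predicted, actual, confidence in samples:
        if predicted != actual:
            wrong[predicted].append(confidence)
    return {action: mean(values) for action, values in wrong.items()}


def should_request_demo(predicted, confidence, thresholds, default=0.5):
    """Ask the teacher for a demonstration when confidence is below the
    threshold for the predicted action; unseen actions use `default`."""
    return confidence < thresholds.get(predicted, default)


# Hypothetical training data: "left" is often wrong even at high confidence.
samples = [
    ("left", "right", 0.75),
    ("left", "right", 0.5),
    ("left", "left", 0.9),
    ("stop", "stop", 0.7),
]
thresholds = fit_thresholds(samples)

# A single fixed threshold is the special case of an empty threshold table.
single = should_request_demo("left", 0.6, {}, default=0.5)
multi = should_request_demo("left", 0.6, thresholds, default=0.5)
```

In this toy data, a single threshold of 0.5 would not trigger a request for a "left" prediction at confidence 0.6, while the per-action threshold does. Actions that were never misclassified fall back to the default.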