Learning equivalent action choices from demonstration

In their interactions with the world, robots inevitably face equivalent action choices: situations in which multiple actions are equally applicable. In this paper, we address the problem of equivalent action choices in learning from demonstration, a robot learning approach in which a policy is acquired from human demonstrations of the desired behavior. We note that when faced with a choice of equivalent actions, a human teacher often demonstrates an action arbitrarily and does not make the choice consistently over time. The resulting inconsistently labeled training data poses a problem for classification-based demonstration learning algorithms by violating the common assumption that for any world state there exists a single best action. This problem has been overlooked by previous approaches to demonstration learning. In this paper, we present an algorithm that identifies regions of the state space with conflicting demonstrations and enables the choice between multiple actions to be represented explicitly within the robot's policy. An experimental evaluation of the algorithm in a real-world obstacle avoidance domain shows that reasoning about action choices significantly improves the robot's learning performance.
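The core idea, detecting state-space regions whose demonstrations carry conflicting action labels and treating those labels as an explicit choice set rather than noise, can be sketched as follows. This is a minimal illustration under assumed names (`find_action_choices`, `state_key`, `min_support` are all hypothetical), not the paper's actual algorithm, which operates on continuous state-space regions rather than exact state matches.

```python
from collections import defaultdict

def find_action_choices(demos, state_key=lambda s: s, min_support=2):
    """Group demonstrations by (discretized) state and flag states whose
    demonstrations carry more than one action label.

    demos: iterable of (state, action) pairs from the teacher.
    state_key: maps a raw state to a hashable bucket (identity here; a
        real system would discretize or cluster the continuous state).
    min_support: minimum number of demonstrations before a state is
        considered reliably observed.
    Returns: dict mapping state bucket -> set of equivalent actions.
    """
    by_state = defaultdict(list)
    for state, action in demos:
        by_state[state_key(state)].append(action)

    choices = {}
    for s, actions in by_state.items():
        labels = set(actions)
        # Conflicting labels in a well-sampled state are treated as an
        # equivalent-action set, not as mislabeled training data.
        if len(labels) > 1 and len(actions) >= min_support:
            choices[s] = labels
    return choices

# Example: the teacher passed an obstacle on either side from state 1.
demos = [(1, "left"), (1, "right"), (2, "forward"), (1, "left")]
print(find_action_choices(demos))  # {1: {'left', 'right'}}
```

At execution time, a policy built on such choice sets may select any member action for a flagged state, instead of forcing a single (arbitrary) label as a standard classifier would.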
