Simultaneous acquisition of task and feedback models

We present a system that learns task representations from ambiguous feedback. We consider an inverse reinforcement learner that receives feedback from a teacher whose protocol is unknown and noisy. The system must simultaneously estimate what the task is (i.e., find a compact representation of the task goal) and how the teacher provides feedback. We further explore the problem of ambiguous protocols by assuming that the words used by the teacher have an unknown relation to the action and meaning expected by the robot. This allows the system to start with a set of known signs and learn the meaning of new ones. We present computational results showing that the task can be learned under noisy and ambiguous feedback. With an active learning approach, the system is able to reduce the length of the training period.
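The idea of estimating the task and the feedback protocol together can be illustrated with a small joint Bayesian update. The sketch below is not the method of the paper: the 1-D chain world, the two signs "A" and "B", the hypothesis names, the fixed noise rate `NOISE`, and every function name are assumptions made for illustration only. It keeps a posterior over pairs (goal state, meaning of the signs) and updates it after each teacher sign. Active query selection is not covered.

```python
from itertools import product

# Hypothetical toy setting (assumption, not from the paper):
# a chain of states 0..4, actions move left (-1) or right (+1).
N_STATES = 5
ACTIONS = (-1, 1)
# Two candidate feedback protocols for the signs "A" and "B".
MAPPINGS = ("A_means_correct", "A_means_wrong")
NOISE = 0.1  # assumed probability that the teacher emits the wrong sign


def is_optimal(state, action, goal):
    """An action is optimal if it moves the agent strictly closer to the goal."""
    return abs(state + action - goal) < abs(state - goal)


def sign_likelihood(sign, state, action, goal, mapping):
    """P(sign | state, action, goal hypothesis, mapping hypothesis)."""
    says_correct = (sign == "A") == (mapping == "A_means_correct")
    consistent = says_correct == is_optimal(state, action, goal)
    return 1.0 - NOISE if consistent else NOISE


def uniform_prior():
    """Uniform prior over all (goal, mapping) pairs."""
    hyps = list(product(range(N_STATES), MAPPINGS))
    return {h: 1.0 / len(hyps) for h in hyps}


def update(posterior, state, action, sign):
    """One Bayes update of the joint posterior after observing a sign."""
    unnorm = {
        (g, m): p * sign_likelihood(sign, state, action, g, m)
        for (g, m), p in posterior.items()
    }
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}


def teacher_sign(state, action, goal, mapping):
    """Simulated noise-free teacher, used only to generate demo feedback."""
    correct = is_optimal(state, action, goal)
    if mapping == "A_means_correct":
        return "A" if correct else "B"
    return "B" if correct else "A"


def run_demo(true_goal=2, true_mapping="A_means_correct"):
    """Show every (state, action) pair once and return the final posterior."""
    posterior = uniform_prior()
    for state, action in product(range(N_STATES), ACTIONS):
        sign = teacher_sign(state, action, true_goal, true_mapping)
        posterior = update(posterior, state, action, sign)
    return posterior


if __name__ == "__main__":
    post = run_demo()
    print(max(post, key=post.get))
```

In this toy setting, a goal at the edge of the chain could be confused with the opposite edge under the flipped sign mapping if feedback were observed only at interior states. This is the kind of ambiguity that known signs or well-chosen queries would help resolve.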
