Interactive Learning from Unlabeled Instructions

Interactive learning deals with the problem of learning and solving tasks using human instructions. It is common in human-robot interaction, tutoring systems, and human-computer interfaces such as brain-computer interfaces. In most cases, learning these tasks is possible because the signals are predefined or because an ad-hoc calibration procedure maps signals to specific meanings. In this paper, we address the problem of simultaneously solving a task under human feedback and learning the meanings of the feedback signals. This has important practical applications, since a user can start controlling a device from scratch, without an expert to define the meaning of the signals and without a calibration phase. The paper proposes an algorithm that assigns meanings to signals while solving a sequential task, under the assumption that human and machine share the same prior knowledge of the possible instruction meanings and the possible tasks. Furthermore, using synthetic data and real EEG data from a brain-computer interface, we show that the machine must take into account the uncertainty about both the task and the signal meanings in order to actively plan how to solve the task efficiently.
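The core idea of learning signal meanings and the task at the same time can be illustrated with a toy sketch. It is not the algorithm proposed in the paper; it only assumes that, for each candidate task, the machine can label the received signals as that task would predict and then check how well the labeled signals separate. The one-dimensional world, the candidate goals, the scalar Gaussian signals, and all function names below are hypothetical choices made for illustration.

```python
import random
import statistics

STATES = list(range(7))      # positions on a line (hypothetical toy world)
CANDIDATE_GOALS = [1, 3, 5]  # task hypotheses shared by human and machine
ACTIONS = [-1, +1]           # move left or right


def is_correct(state, action, goal):
    """An action counts as 'correct' for a goal if it moves the agent closer to it."""
    return abs(state + action - goal) < abs(state - goal)


def simulate_teacher(true_goal, n_steps, rng):
    """Return (state, action, signal) triples; signals are unlabeled scalars."""
    data = []
    for _ in range(n_steps):
        state = rng.choice(STATES)
        action = rng.choice(ACTIONS)
        # Assumption: the teacher emits one noisy cluster per meaning.
        center = 1.0 if is_correct(state, action, true_goal) else -1.0
        data.append((state, action, rng.gauss(center, 0.3)))
    return data


def separability(data, goal):
    """Label signals as this goal hypothesis predicts, then score cluster separation."""
    pos = [sig for s, a, sig in data if is_correct(s, a, goal)]
    neg = [sig for s, a, sig in data if not is_correct(s, a, goal)]
    if len(pos) < 2 or len(neg) < 2:
        return 0.0
    pooled_std = ((statistics.pvariance(pos) + statistics.pvariance(neg)) / 2) ** 0.5
    # The absolute value makes the score independent of which cluster means "correct".
    return abs(statistics.mean(pos) - statistics.mean(neg)) / (pooled_std + 1e-9)


rng = random.Random(0)
data = simulate_teacher(true_goal=3, n_steps=300, rng=rng)
scores = {g: separability(data, g) for g in CANDIDATE_GOALS}
best_goal = max(scores, key=scores.get)
print(best_goal, scores)
```

In this sketch only the true goal yields labels that match the two signal clusters, so its separation score is the highest. Wrong goal hypotheses mix the clusters and score lower. The sketch samples states and actions at random; it does not attempt the uncertainty-aware planning that the abstract describes.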
