Interactive Learning from Unlabeled Instructions

Interactive learning deals with the problem of learning and solving tasks using human instructions. It is common in human-robot interaction, tutoring systems, and human-computer interfaces such as brain-computer interfaces. In most cases, learning these tasks is possible because the signals are predefined or because an ad hoc calibration procedure maps signals to specific meanings. In this paper, we address the problem of simultaneously solving a task under human feedback and learning the meanings of the feedback signals. This has important practical applications, since the user can start controlling a device from scratch, without needing an expert to define the meaning of the signals or to carry out a calibration phase. The paper proposes an algorithm that simultaneously assigns meanings to signals while solving a sequential task, under the assumption that human and machine share the same prior over the possible instruction meanings and the possible tasks. Furthermore, using synthetic data and real EEG data from a brain-computer interface, we show that taking into account the uncertainty of both the task and the signal is necessary for the machine to actively plan how to solve the task efficiently.
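
The idea of assigning meanings to signals while identifying the task can be illustrated with a small toy sketch. It is not the paper's algorithm: the one-dimensional line world, the scalar feedback signals, the Gaussian signal model, and all function names (`expected_label`, `score_hypothesis`, `simulate`, `infer_goal`) are assumptions made for illustration. For each candidate goal, the sketch labels every observed signal with the meaning a teacher would have intended under that goal, fits one Gaussian per meaning, and keeps the goal whose labeling explains the signals best. The active planning part of the abstract is left out; actions here are chosen at random.

```python
import math
import random


def expected_label(state, action, goal):
    """True if `action` (+1 = right, -1 = left) moves `state` toward `goal`."""
    return (goal - state) * action > 0


def gaussian_loglik(xs):
    """Log-likelihood of xs under a Gaussian fitted to xs (variance floored)."""
    if len(xs) < 2:
        return 0.0
    mean = sum(xs) / len(xs)
    var = max(sum((x - mean) ** 2 for x in xs) / len(xs), 1e-3)
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        for x in xs
    )


def score_hypothesis(history, goal):
    """Label signals as if `goal` were the task, then score the labeling."""
    pos = [sig for s, a, sig in history if expected_label(s, a, goal)]
    neg = [sig for s, a, sig in history if not expected_label(s, a, goal)]
    # A consistent labeling yields two tight clusters, hence a high likelihood.
    return gaussian_loglik(pos) + gaussian_loglik(neg)


def simulate(true_goal, n_states=7, n_steps=200, seed=0):
    """Hypothetical teacher: emits a noisy scalar whose mean encodes the meaning."""
    rng = random.Random(seed)
    states = [s for s in range(n_states) if s != true_goal]
    history = []
    for _ in range(n_steps):
        state = rng.choice(states)
        action = rng.choice([-1, 1])
        mean = 1.0 if expected_label(state, action, true_goal) else -1.0
        history.append((state, action, rng.gauss(mean, 0.3)))
    return history


def infer_goal(history, n_states=7):
    """Return the candidate goal whose induced labeling fits the signals best."""
    return max(range(n_states), key=lambda g: score_hypothesis(history, g))


if __name__ == "__main__":
    history = simulate(true_goal=4)
    print("inferred goal:", infer_goal(history))
```

The learner never sees which signal value means "correct"; it only relies on the shared assumption that signals carry one of two meanings and that the task is one of the candidate goals. A wrong goal mislabels some signals, which mixes the two clusters and lowers the score.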
