Mutual Reinforcement Learning

Collaborative robots have recently begun to train humans in complex tasks, and the mutual exchange of information between the two can lead to successful robot-human collaboration. In this paper we demonstrate the application and effectiveness of a new approach called mutual reinforcement learning (MRL), in which both humans and autonomous agents act as reinforcement learners in a skill-transfer scenario built on continuous communication and feedback. An autonomous agent initially acts as an instructor that teaches a novice human participant complex skills using the MRL strategy. While teaching skills in a physical (block-building, $n=34$) or simulated (Tetris, $n=31$) environment, the expert tries to identify the reward channels each individual prefers and adapts itself accordingly using an exploration-exploitation strategy. These reward channel preferences can reveal important behaviors of the human participants, who may well exhibit the same behaviors in similar situations later. In this way, skill transfer takes place between an expert system and a novice human operator. We divided the subject population into three groups and observed the skill transfer phenomenon, analyzing it with Simpson's psychometric model. Five-point Likert scales were also used to identify the cognitive models of the human participants. We obtained a shared cognitive model that not only improves human cognition but also enhances the robot's cognitive strategy for understanding the mental model of its human partners while building a successful robot-human collaborative framework.
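The exploration-exploitation idea mentioned above can be illustrated with a minimal sketch. It assumes an epsilon-greedy multi-armed bandit, which is one common exploration-exploitation strategy and not necessarily the one used in the paper. The channel names, the `ChannelSelector` class, the `simulate` helper, and the numeric "response" signal are all hypothetical and introduced only for illustration.

```python
import random


class ChannelSelector:
    """Epsilon-greedy choice among hypothetical reward channels."""

    def __init__(self, channels, epsilon=0.1, seed=None):
        self.channels = list(channels)
        self.epsilon = epsilon                 # probability of exploring
        self.rng = random.Random(seed)
        self.counts = {c: 0 for c in self.channels}
        self.values = {c: 0.0 for c in self.channels}  # running mean response

    def select(self):
        # Explore: occasionally try a random channel.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.channels)
        # Exploit: otherwise use the channel with the best estimate so far.
        return max(self.channels, key=lambda c: self.values[c])

    def update(self, channel, response):
        # Incremental mean of the observed responses for this channel.
        self.counts[channel] += 1
        n = self.counts[channel]
        self.values[channel] += (response - self.values[channel]) / n


def simulate(preferences, steps=500, epsilon=0.1, seed=0):
    """Simulate a participant whose chance of responding well to each
    channel is given by `preferences` (assumed values, not study data)."""
    selector = ChannelSelector(preferences, epsilon, seed)
    participant = random.Random(seed + 1)
    for _ in range(steps):
        channel = selector.select()
        response = 1.0 if participant.random() < preferences[channel] else 0.0
        selector.update(channel, response)
    return selector


if __name__ == "__main__":
    # Hypothetical channels and preference probabilities.
    result = simulate({"verbal": 0.2, "visual": 0.8, "gestural": 0.4})
    for name in result.channels:
        print(name, result.counts[name], round(result.values[name], 2))
```

In this sketch the instructor keeps a running estimate of how well each channel works for one participant and mostly reuses the best one, while `epsilon` controls how often it still tries the alternatives.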
