Learning polite behavior with situation models

In this paper, we describe experiments with methods for learning the appropriateness of behaviors based on a model of the current social situation. We first review different approaches to social robotics and present a new approach based on situation modeling. We then review algorithms for social learning and propose three modifications to the classical Q-Learning algorithm. We describe five experiments with progressively more complex algorithms for learning the appropriateness of behaviors. The first three experiments illustrate how social factors can be used to improve learning by controlling the learning rate. In the fourth experiment, we demonstrate that proper credit assignment improves the effectiveness of reinforcement learning for social interaction. In the fifth experiment, we show that analogy can be used to accelerate learning in contexts composed of many situations.
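
For orientation, the classical tabular Q-Learning update that the proposed modifications start from can be sketched as follows. This is a minimal illustration, not the algorithm evaluated in the experiments: the state and action names, the `scaled_alpha` helper, and the idea of multiplying the learning rate by a social weight in [0, 1] are all assumptions introduced here to show how a social factor could, in principle, control the learning rate.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one classical tabular Q-Learning update and return the new value."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference error between the target and the current estimate.
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


def scaled_alpha(base_alpha, social_weight):
    """Hypothetical helper: scale the learning rate by a social weight in [0, 1]."""
    return base_alpha * max(0.0, min(1.0, social_weight))


# Hypothetical situations and behaviors, used only for illustration.
actions = ["greet", "wait"]
q = defaultdict(float)

# Feedback given full weight moves the estimate by the full base learning rate.
strong = q_update(q, "person_enters", "greet", 1.0, "conversation", actions,
                  alpha=scaled_alpha(0.5, 1.0))

# Feedback given zero weight leaves the estimate unchanged.
ignored = q_update(q, "person_enters", "wait", 1.0, "conversation", actions,
                   alpha=scaled_alpha(0.5, 0.0))
```

With these assumed values, the fully weighted update moves the estimate for `greet` halfway toward the reward, while the zero-weighted update leaves the estimate for `wait` at its initial value.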
