Multi-cue Contingency Detection

The ability to detect a human’s contingent response is an essential skill for a social robot attempting to engage new interaction partners or maintain ongoing turn-taking interactions. Prior work on contingency detection focuses on single cues from isolated channels, such as changes in gaze, motion, or sound. We propose a framework that integrates multiple cues for detecting contingency from multimodal sensor data in human-robot interaction scenarios. We describe three levels of integration and discuss our method for performing sensor fusion at each of these levels. We conduct a Wizard-of-Oz data collection experiment in which our humanoid robot plays the turn-taking imitation game “Simon says” with human partners. Using this data set, which includes motion and body pose cues from depth and color images and audio cues from a microphone, we evaluate our contingency detection module with the proposed integration mechanisms and show that our multi-cue approach improves accuracy over single-cue contingency detection. We also show the importance of selecting the appropriate level of cue integration, as well as the implications of varying the referent event parameter.
