Multimodal conversational interaction with a humanoid robot

This paper presents a multimodal conversational interaction system for the Nao humanoid robot. The system was developed at the 8th International Summer Workshop on Multimodal Interfaces, Metz, 2012. We implemented WikiTalk, an existing spoken dialogue system for open-domain conversations, on Nao. This greatly extended the robot's interaction capabilities by enabling Nao to talk about an unlimited range of topics. In addition to speech interaction, we developed a wide range of multimodal interactive robot behaviours, including face-tracking, nodding, communicative gesturing, proximity detection and tactile interrupts. We made video recordings of user interactions and used questionnaires to evaluate the system. We further extended the robot's capabilities by linking Nao with Kinect.