Multimodal Open-Domain Conversations with the Nao Robot

In this paper we discuss the design of human-robot interaction, focusing especially on social robot communication and multimodal information presentation. As a starting point we use the WikiTalk application, an open-domain conversational system previously developed using a robotics simulator. We describe how it can be implemented on the Nao robot platform, enabling Nao to make informative spoken contributions on a wide range of topics during conversation. Spoken interaction is further combined with gesturing in order to support Nao's presentations with natural multimodal capabilities, and to enhance and explore natural communication between human users and robots.
