Proposal and Evaluation of a Head Tilting Generation Method for Humanoid Communication Robot

Suitable control of a robot's head motion, synchronized with its utterances, is important for smooth human-robot interaction. Based on rules inferred from analyses of the relationship between head motion and dialogue acts, this paper proposes a model for generating head tilting and evaluates it on different types of humanoid robots. Analysis of subjective scores showed that the proposed model generates head motion that is perceived as more natural than nodding only or than directly mapping people's original motions without gaze information. We also evaluated the proposed model in a real human-robot interaction, conducting an experiment in which participants acted as visitors to an information desk attended by robots. The effects of gaze control were also taken into account when mapping the original motion to the robot. Evaluation results indicated that, in terms of perceived naturalness, the proposed model performs as well as directly mapping people's original motion with gaze information.
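
The abstract does not spell out the generation rules, but the following minimal Python sketch illustrates the general shape of a rule-based head-gesture selector driven by dialogue acts at phrase finals, as the proposed model is described. The `PhraseFinal` fields, dialogue-act labels, and gesture choices are illustrative assumptions, not the authors' actual rules.

```python
# Hypothetical sketch of a rule-based head-gesture selector.
# Labels, fields, and thresholds are assumptions for illustration only.

from dataclasses import dataclass
from typing import Optional


@dataclass
class PhraseFinal:
    dialog_act: str          # e.g. "question", "statement", "backchannel"
    is_turn_yielding: bool   # whether the speaker is handing over the turn


def select_head_gesture(phrase: PhraseFinal) -> Optional[str]:
    """Map a phrase-final dialogue act to a head gesture ("nod", "tilt", or None)."""
    if phrase.dialog_act == "question" and phrase.is_turn_yielding:
        return "tilt"        # a tilt can signal that a reply is expected
    if phrase.dialog_act in ("statement", "backchannel"):
        return "nod"         # a nod at a phrase final marks affirmation or emphasis
    return None              # otherwise keep the head still


if __name__ == "__main__":
    print(select_head_gesture(PhraseFinal("question", True)))    # -> "tilt"
    print(select_head_gesture(PhraseFinal("statement", False)))  # -> "nod"
```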
