Intuitive Multimodal Interaction with Communication Robot Fritz

One of the most important motivations for many humanoid robot projects is that robots with a human-like body and human-like senses could, in principle, be capable of intuitive multimodal communication with people. The general idea is that by mimicking the way humans interact with each other, the efficient and robust communication strategies that humans use in their interactions can be transferred to the human-machine interface. This includes the use of multiple modalities such as speech, facial expressions, gestures, and body language. If successful, this approach yields a user interface that leverages the evolution of human communication and that is intuitive to naive users, because they have practiced it since early childhood.

We work towards intuitive multimodal communication in the domain of a museum guide robot, an application that requires interacting with multiple unknown persons. Testing communication robots in science museums and at science fairs is popular because the robots encounter many new interaction partners there who have a general interest in science and technology. Here, we present the humanoid communication robot Fritz, which we developed as the successor to the communication robot Alpha (Bennewitz et al., 2005). Fritz uses speech, facial expressions, eye-gaze, and gestures to interact with people. Depending on the audio-visual input, the robot shifts its attention between different persons in order to involve them in the interaction. It performs human-like arm gestures during the conversation and also uses pointing gestures, generated with its eyes, head, and arms, to direct the attention of its communication partners towards the explained exhibits. To express its emotional state, the robot generates facial expressions and adapts its speech synthesis.

The remainder of the chapter is organized as follows. The next section reviews related work. The mechanical and electrical design of Fritz is covered in Sec. 3. Sec. 4 details the perception of the human communication partners, and Sec. 5 explains the robot's attentional system. The generation of arm gestures and of facial expressions is presented in Sec. 6 and 7, respectively. Finally, in the experimental section, we discuss experiences gained during public demonstrations of our robot.
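To make the attention-shifting idea above more concrete before its detailed treatment in Sec. 5, the following minimal Python sketch illustrates one possible way of fusing audio-visual cues into a single focus-of-attention decision. All names, weights, and inputs (face-detection confidences, speech-source activities) are hypothetical assumptions introduced only for illustration; this is not the implementation running on Fritz.

# Illustrative sketch only: fusing audio-visual cues to pick a focus of attention.
# The weights and scores below are assumed values, not parameters from the chapter.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Person:
    name: str
    face_confidence: float  # visual face-detection confidence in [0, 1]
    sound_activity: float   # speech-source activity from sound localization in [0, 1]
    angle_deg: float        # bearing of the person relative to the robot

def attention_score(p: Person, w_face: float = 0.5, w_sound: float = 0.5) -> float:
    """Combine visual and auditory salience into a single attention score."""
    return w_face * p.face_confidence + w_sound * p.sound_activity

def select_focus(people: List[Person]) -> Optional[Person]:
    """Return the person the robot should attend to, or None if nobody is detected."""
    return max(people, key=attention_score, default=None)

if __name__ == "__main__":
    visitors = [
        Person("left visitor", face_confidence=0.9, sound_activity=0.1, angle_deg=-30.0),
        Person("right visitor", face_confidence=0.7, sound_activity=0.8, angle_deg=25.0),
    ]
    focus = select_focus(visitors)
    if focus is not None:
        print(f"Shift gaze towards {focus.name} at {focus.angle_deg:.0f} degrees")

In this toy setting, the speaking visitor on the right receives the higher combined score, so the robot would turn its attention there; a real attentional system would of course also account for temporal persistence and the robot's current conversational state.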

[1] Sebastian Thrun et al. Spontaneous, short-term interaction with mobile robots. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 1999.

[2] Takashi Minato et al. Generating natural motion in an android by mapping human motion. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2005.

[3] D. McNeill. Hand and Mind. 1995.

[4] Maurizio Omologo et al. Talker localization and speech recognition using a microphone array and a cross-powerspectrum phase analysis. Proceedings of the International Conference on Spoken Language Processing (ICSLP), 1994.

[5] Kristinn R. Thórisson. Natural Turn-Taking Needs No Manual: Computational Theory and Model, from Perception to Action. 2002.

[6] Hans P. Moravec et al. High resolution maps from wide angle sonar. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 1985.

[7] Roland Siegwart et al. Improving the expressiveness of mobile robots. Proceedings of the 11th IEEE International Workshop on Robot and Human Interactive Communication (RO-MAN), 2002.

[8] D. McNeill. Hand and Mind: What Gestures Reveal about Thought. 1992.

[9] Illah R. Nourbakhsh et al. An Affective Mobile Robot Educator with a Full-Time Job. Artificial Intelligence, 1999.

[10] Jannik Fritsch et al. Human-like person tracking with an anthropomorphic robot. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2006.

[11] Tetsunori Kobayashi et al. Modeling of conversational strategy for the robot participating in the group conversation. INTERSPEECH, 2001.

[12] Hiroaki Kitano et al. Social Interaction of Humanoid Robot Based on Audio-Visual Tracking. Proceedings of the International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems (IEA/AIE), 2002.

[13] Sebastian Lang et al. Providing the basis for human-robot-interaction: A multi-modal attention system for a mobile robot. Proceedings of the International Conference on Multimodal Interfaces (ICMI), 2003.

[14] Paul J. W. ten Hagen et al. Emotion Disc and Emotion Squares: Tools to Explore the Facial Expression Space. Computer Graphics Forum, 2003.

[15] Maren Bennewitz et al. Towards a humanoid museum guide robot that interacts with multiple persons. Proceedings of the 5th IEEE-RAS International Conference on Humanoid Robots (Humanoids), 2005.

[16] Janet E. Cahn. Generating expression in synthesized speech. 1989.

[17] Paul A. Viola et al. Robust Real-Time Face Detection. International Journal of Computer Vision, 2001.

[18] Sven Behnke et al. NimbRo TeenSize 2006 Team Description. 2006.

[19] Cory D. Kidd et al. Humanoid Robots as Cooperative Partners for People. 2004.