Integrating vision and speech for conversations with multiple persons

An essential capability for a robot designed to interact with humans is to show attention to the people in its surroundings. Involving multiple persons in an interaction requires the robot to maintain an accurate belief about the people in its environment. In this paper, we use a probabilistic technique to update the robot's knowledge based on sensory input. This allows the robot to reason about the uncertainty in its belief about the people in its vicinity and to shift its attention between different persons. Even people who are not the primary conversational partners are included in the interaction. In practical experiments with a humanoid robot, we demonstrate the effectiveness of our approach.
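
The abstract does not specify the probabilistic technique, so the following is only an illustrative sketch of one common choice: a recursive Bayes update of the probability that a person is present, driven by a binary face-detector reading. The function names, the detector rates `p_hit` and `p_false`, and the "left"/"right" person slots are all assumptions made for this example and are not taken from the paper.

```python
def update_presence(prior, detected, p_hit=0.9, p_false=0.2):
    """One Bayes update of P(person present) from a binary detector reading.

    p_hit   : assumed P(detection | person present)
    p_false : assumed P(detection | no person present)
    """
    if detected:
        like_present, like_absent = p_hit, p_false
    else:
        like_present, like_absent = 1.0 - p_hit, 1.0 - p_false
    numerator = like_present * prior
    return numerator / (numerator + like_absent * (1.0 - prior))


def most_certain_person(beliefs):
    """Return the key of the person slot with the highest presence belief."""
    return max(beliefs, key=beliefs.get)


# Hypothetical scenario: two candidate person positions, both initially uncertain.
beliefs = {"left": 0.5, "right": 0.5}

# The "left" slot receives two detections followed by one miss.
for detected in (True, True, False):
    beliefs["left"] = update_presence(beliefs["left"], detected)

# The "right" slot receives a single miss.
beliefs["right"] = update_presence(beliefs["right"], False)

focus = most_certain_person(beliefs)
```

In this sketch a single missed detection lowers the belief without discarding the person, which is one way a robot could keep track of people who are temporarily outside its field of view.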
