Multiple Parallel Vision-Based Recognition in a Real-Time Framework for Human-Robot-Interaction Scenarios

Everyday human communication relies on a large number of different mechanisms, such as spoken language, facial expressions, body pose, and gestures, allowing humans to convey large amounts of information in a short time. In contrast, traditional human-machine communication is often unintuitive and requires specifically trained personnel. In this paper, we present a real-time-capable framework that recognizes such visual human communication signals in order to establish a more intuitive human-machine interaction. Humans rely on the interaction partner's face for identification, which helps them adapt to the partner and exploit context information. Head gestures (nodding and shaking) are a convenient way to show agreement or disagreement. Facial expressions give evidence about the interaction partner's emotional state, and hand gestures are a fast way of passing simple commands. The recognition of all interaction cues is performed in parallel, enabled by a shared-memory implementation.
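To make the shared-memory design concrete, the following is a minimal sketch (not the authors' implementation) of how several recognizers could run in parallel, each publishing its latest result into a shared store that a consumer reads without blocking on any single cue. The cue names and the placeholder recognition step are assumptions for illustration only.

```python
# Minimal sketch of parallel cue recognition with a shared result store.
# Not the paper's actual code: cue names and the dummy recognition step
# are placeholders standing in for real vision processing.
import multiprocessing as mp
import random
import time

def recognizer(name, shared, stop):
    """Run one recognizer loop; publish the newest result for this cue."""
    while not stop.is_set():
        result = random.choice(["yes", "no", "none"])  # stand-in for real detection
        shared[name] = (result, time.time())           # latest estimate + timestamp
        time.sleep(0.05)                               # ~20 Hz per cue

if __name__ == "__main__":
    manager = mp.Manager()
    shared = manager.dict()   # shared-memory analogue: one slot per cue
    stop = mp.Event()
    cues = ["identity", "head_gesture", "facial_expression", "hand_gesture"]
    workers = [mp.Process(target=recognizer, args=(c, shared, stop)) for c in cues]
    for w in workers:
        w.start()
    time.sleep(0.5)           # let the recognizers produce a few results
    print(dict(shared))       # consumer reads all cues without waiting on any one
    stop.set()
    for w in workers:
        w.join()
```

Because each recognizer only overwrites its own slot, a slow cue (e.g. expression classification) never delays a fast one (e.g. hand-gesture detection); the consumer always sees the most recent result per cue.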
