Recognizing complex, parameterized gestures from monocular image sequences

Robotic assistants designed to coexist and communicate with humans in the real world should be able to interact with them in an intuitive way. This requires robots to recognize typical human gestures such as head shaking or nodding, hand waving, and pointing. In this paper, we present a system that spots and recognizes complex, parameterized gestures in monocular image sequences. To represent people, we locate their faces and hands using trained classifiers and track them over time. We extract a few expressive features from this compact representation and use them as input to hidden Markov models (HMMs). First, we segment gestures into distinct phases and train an HMM for each phase separately. Then, we construct composed HMMs, which consist of the individual phase-HMMs. Once a specific phase is recognized, we estimate the parameter of the current gesture, e.g., the target of a pointing gesture. As we demonstrate in the experiments, our method robustly locates and tracks hands, even though they can take a large number of substantially different shapes. Based on this, our system reliably spots and recognizes a variety of complex, parameterized gestures.
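The idea of chaining phase-HMMs into a composed HMM and reading the current phase off the decoded state sequence can be illustrated with a minimal sketch. It is not the implementation described in the paper: the three phase names, the discrete observation symbols, the exit probability `p_exit`, and the helper names `compose` and `viterbi` are all assumptions made for this example, and standard Viterbi decoding is used to recover the most likely state path.

```python
import math

NEG_INF = float("-inf")


def log(p):
    """Logarithm that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF


def compose(phases, p_exit=0.2):
    """Chain phase-HMMs into one left-to-right HMM (illustrative scheme).

    Each phase is (trans, emit): trans is a row-stochastic matrix over the
    phase's own states, emit[i][o] is a discrete emission table. The last
    state of every phase except the final one hands p_exit of its
    probability mass to the first state of the next phase.
    """
    sizes = [len(trans) for trans, _ in phases]
    n = sum(sizes)
    A = [[0.0] * n for _ in range(n)]
    B, owner = [], []  # owner[s] = index of the phase that owns state s
    offset = 0
    for k, (trans, emit) in enumerate(phases):
        is_final_phase = k == len(phases) - 1
        for i, row in enumerate(trans):
            exits = i == sizes[k] - 1 and not is_final_phase
            scale = 1.0 - p_exit if exits else 1.0
            for j, p in enumerate(row):
                A[offset + i][offset + j] = p * scale
            if exits:
                A[offset + i][offset + sizes[k]] = p_exit
            B.append(list(emit[i]))
            owner.append(k)
        offset += sizes[k]
    return A, B, owner


def viterbi(A, B, obs, start=0):
    """Most likely state path for obs, assuming decoding begins in `start`."""
    n = len(A)
    delta = [log(B[s][obs[0]]) if s == start else NEG_INF for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + log(A[i][j]))
            ptr.append(best)
            delta.append(prev[best] + log(A[best][j]) + log(B[j][o]))
        back.append(ptr)
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical phases of a pointing gesture; symbols 0/1/2 stand for a
# quantized hand-motion feature (rising, still, falling).
preparation = ([[0.6, 0.4], [0.0, 1.0]], [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1]])
hold = ([[1.0]], [[0.1, 0.8, 0.1]])
retraction = ([[1.0]], [[0.1, 0.1, 0.8]])

A, B, owner = compose([preparation, hold, retraction])
observations = [0, 0, 1, 1, 1, 2, 2]
path = viterbi(A, B, observations)
phase_per_frame = [owner[s] for s in path]
print(phase_per_frame)
```

In a setup like this, the frames labeled with the hold phase would be the natural place to estimate the gesture parameter, such as a pointing target, since that estimate is only triggered once the corresponding phase has been recognized.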
