Model-based segmentation and recognition of dynamic gestures in continuous video streams

Segmenting and recognizing continuous gestures is challenging because of spatio-temporal variation and the difficulty of localizing gesture endpoints. A novel multi-scale Gesture Model is presented here as a set of 3D spatio-temporal surfaces of a time-varying contour. Three approaches are proposed, which differ mainly in how they localize endpoints: the first uses a motion detection strategy and a multi-scale search to find the endpoints; the second uses Dynamic Time Warping to locate the endpoints roughly before carrying out a fine search; the third is based on Dynamic Programming. Experimental results on two-arm and single-hand gestures show that all three methods achieve high recognition rates, ranging from 88% to 96% on the two-arm test, with the third method performing best.
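To make the idea of rough endpoint localization with Dynamic Time Warping concrete, the sketch below runs a generic subsequence DTW over a one-dimensional feature stream. It is not the authors' implementation: the function name `subsequence_dtw`, the scalar per-frame features, and the absolute-difference cost are all assumptions chosen for illustration, whereas the paper's model is built on 3D spatio-temporal contour surfaces.

```python
from math import inf


def subsequence_dtw(template, stream):
    """Find the stream segment that best matches the template.

    Returns (cost, start, end), where start and end are inclusive
    frame indices into the stream. Features are assumed to be scalars
    (a simplification made for this sketch).
    """
    n, m = len(template), len(stream)
    # cost[i][j]: best cost of aligning template[0..i] to a segment ending at stream[j]
    cost = [[inf] * m for _ in range(n)]
    # start[i][j]: stream index where that best alignment begins
    start = [[0] * m for _ in range(n)]

    # Open beginning: the template may start at any stream frame at no extra cost.
    for j in range(m):
        cost[0][j] = abs(template[0] - stream[j])
        start[0][j] = j

    for i in range(1, n):
        for j in range(m):
            local = abs(template[i] - stream[j])
            # Allowed predecessors: vertical, horizontal, diagonal.
            candidates = [(cost[i - 1][j], start[i - 1][j])]
            if j > 0:
                candidates.append((cost[i][j - 1], start[i][j - 1]))
                candidates.append((cost[i - 1][j - 1], start[i - 1][j - 1]))
            best_cost, best_start = min(candidates)
            cost[i][j] = local + best_cost
            start[i][j] = best_start

    # Open end: pick the stream frame where the full template matches most cheaply.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end], start[n - 1][end], end


result = subsequence_dtw([1, 2, 3], [0, 0, 1, 2, 3, 0, 0])
print(result)
```

In a pipeline of the kind the abstract describes, the returned `start` and `end` would only be coarse estimates, to be refined by a finer search around those frames.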
