Definition and recovery of kinematic features for recognition of American sign language movements

An approach to recognizing human hand gestures from a monocular temporal sequence of images is presented. The focus is the representation and recognition of hand movements used in single-handed American Sign Language (ASL). The approach exploits previous linguistic analysis of manual languages that decomposes dynamic gestures into their static and dynamic components. The first level of decomposition is in terms of three sets of primitives: hand shape, location, and movement. Further levels of decomposition involve the lexical and sentence levels and are beyond the scope of the present paper. We propose, and subsequently demonstrate, that given a monocular gesture sequence, kinematic features can be recovered from the apparent motion and that these features provide distinctive signatures for 14 primitive movements of ASL. The approach has been implemented in software and evaluated on a database of 592 gesture sequences, with an overall recognition rate of 86% for fully automated processing and 97% for manually initialized processing.
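
The abstract does not say which kinematic features are recovered or how. As a minimal sketch of one common possibility, the Python below assumes the apparent motion of the hand region has already been fitted with a first-order (affine) flow model, u = a0 + a1*x + a2*y and v = a3 + a4*x + a5*y, and derives translation, divergence, curl, and deformation magnitude from its parameters. The affine model, the parameter names a0..a5, and the function kinematic_features are illustrative assumptions, not the paper's stated method.

```python
import math
from dataclasses import dataclass


@dataclass
class KinematicFeatures:
    translation: tuple   # (horizontal, vertical) image velocity at the origin
    divergence: float    # isotropic expansion (+) or contraction (-)
    curl: float          # rotation about the viewing direction
    deformation: float   # magnitude of shear / anisotropic stretch


def kinematic_features(a0, a1, a2, a3, a4, a5):
    """Derive first-order kinematic quantities from affine flow parameters.

    Assumed (hypothetical) flow model:
        u(x, y) = a0 + a1*x + a2*y
        v(x, y) = a3 + a4*x + a5*y
    """
    divergence = a1 + a5          # du/dx + dv/dy
    curl = a4 - a2                # dv/dx - du/dy
    stretch = a1 - a5             # du/dx - dv/dy
    shear = a2 + a4               # du/dy + dv/dx
    deformation = math.hypot(stretch, shear)
    return KinematicFeatures((a0, a3), divergence, curl, deformation)


if __name__ == "__main__":
    # A pure in-plane rotation field, u = -y, v = x.
    print(kinematic_features(0.0, 0.0, -1.0, 0.0, 1.0, 0.0))
```

Under these assumptions, a hand moving toward the camera would show up mainly as positive divergence, while a rotation in the image plane would show up mainly as curl. Time series of such quantities are the sort of signature that could be compared across movement classes.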
