LDCRF-based hand gesture recognition

This paper proposes a system to recognize isolated American Sign Language gestures and numbers in real time from a Bumblebee stereo camera using Latent-Dynamic Conditional Random Fields (LDCRFs). Our system consists of three main stages: preprocessing, feature extraction, and classification. In the preprocessing stage, color and a 3D depth map are used to detect and track the hand; the depth information identifies the region of interest, which reduces the cost of searching and increases processing speed. In the second stage, features of location, orientation, and velocity are combined with respect to a polar coordinate system. In the final stage, the hand gesture path is recognized using LDCRFs, in which the number of hidden states assigned to each class label is restricted to keep training and inference tractable. Experimental results demonstrate that our system recognizes gestures with a 96.14% recognition rate, which compares favorably with results published in the literature.
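As a rough illustration of the feature-extraction stage, the sketch below computes location, orientation, and velocity features in a polar coordinate system from a 2D hand-centroid trajectory. The function name and the exact feature definitions are assumptions for illustration; the paper's precise formulation (e.g., its choice of polar origin and any quantization of angles) is not reproduced here.

```python
import math

def polar_features(path, center=None):
    """Illustrative spatio-temporal features for a 2D hand-centroid path.

    path: list of (x, y) centroids, one per frame. Returns, for each
    frame-to-frame transition, a tuple (r, theta, phi, v):
    location (r, theta) in polar coordinates about `center`
    (default: the path's mean point), orientation phi of the motion
    direction, and velocity v as displacement per frame.
    """
    if center is None:
        cx = sum(x for x, _ in path) / len(path)
        cy = sum(y for _, y in path) / len(path)
    else:
        cx, cy = center

    feats = []
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        # Location: radial distance and angle of the current point
        # relative to the chosen polar origin.
        r = math.hypot(x1 - cx, y1 - cy)
        theta = math.atan2(y1 - cy, x1 - cx)
        # Orientation: direction of motion between consecutive frames.
        phi = math.atan2(y1 - y0, x1 - x0)
        # Velocity: displacement magnitude per frame.
        v = math.hypot(x1 - x0, y1 - y0)
        feats.append((r, theta, phi, v))
    return feats
```

Each per-frame feature tuple would then serve as the observation vector fed to the LDCRF classifier, which models the sequence of observations along the gesture path.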
