Sign language spotting based on semi-Markov Conditional Random Field

Sign language spotting is the task of detecting the start and end points of signs in continuous data and recognizing the detected signs within a predefined vocabulary. The difficulty is that instances of a sign vary in both motion and shape, and the motion itself varies in both trajectory and length. Variable sign length in particular makes spotting in a video sequence hard, because short signs carry less information and fewer changes than long signs. In this paper, we propose a method for spotting variable-length signs based on a semi-Markov Conditional Random Field (semi-CRF). We performed experiments on American Sign Language (ASL) and Korean Sign Language (KSL) datasets of continuous sign sentences to demonstrate the efficiency of the proposed method. The experimental results showed that the proposed method outperforms both HMM and CRF.
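To illustrate why a semi-Markov model suits variable-length signs, the following is a minimal sketch of generic segment-level Viterbi decoding, not the paper's implementation. It scores whole segments rather than single frames, so each candidate sign is evaluated over its full duration. The function name `semi_markov_viterbi`, the per-frame score matrix, the transition matrix, and the rule that adjacent segments must carry different labels are all assumptions made for this example; the paper's actual feature functions and training procedure are not given in the abstract.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int, int]  # (start, end_exclusive, label)


def semi_markov_viterbi(
    frame_scores: List[List[float]],  # hypothetical: frame_scores[t][y], log-score of label y at frame t
    trans: List[List[float]],         # hypothetical: trans[y_prev][y], log-score of a label switch
    max_len: int,                     # longest segment (in frames) the decoder may propose
) -> List[Segment]:
    """Return the highest-scoring segmentation as (start, end_exclusive, label) triples."""
    T, K = len(frame_scores), len(frame_scores[0])

    # Prefix sums let us score any segment of one label in O(1).
    prefix = [[0.0] * K for _ in range(T + 1)]
    for t in range(T):
        for y in range(K):
            prefix[t + 1][y] = prefix[t][y] + frame_scores[t][y]

    neg_inf = float("-inf")
    # best[end][y]: best score of a segmentation of frames [0, end) whose last segment has label y.
    best = [[neg_inf] * K for _ in range(T + 1)]
    back: List[List[Optional[Tuple[int, Optional[int]]]]] = [[None] * K for _ in range(T + 1)]

    for end in range(1, T + 1):
        for y in range(K):
            for d in range(1, min(max_len, end) + 1):
                start = end - d
                seg = prefix[end][y] - prefix[start][y]
                if start == 0:
                    # First segment of the sequence: no transition term.
                    if seg > best[end][y]:
                        best[end][y] = seg
                        back[end][y] = (0, None)
                    continue
                for y_prev in range(K):
                    # Assumption: adjacent segments carry different labels.
                    if y_prev == y or best[start][y_prev] == neg_inf:
                        continue
                    cand = best[start][y_prev] + trans[y_prev][y] + seg
                    if cand > best[end][y]:
                        best[end][y] = cand
                        back[end][y] = (start, y_prev)

    # Backtrack from the best final label.
    y: Optional[int] = max(range(K), key=lambda k: best[T][k])
    end = T
    segments: List[Segment] = []
    while end > 0:
        start, y_prev = back[end][y]
        segments.append((start, end, y))
        end, y = start, y_prev
    segments.reverse()
    return segments


# Toy input: label 0 = non-sign, label 1 = a sign; frames 2..4 favor the sign.
toy_scores = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [1, 0]]
toy_trans = [[0, 0], [0, 0]]
toy_segments = semi_markov_viterbi(toy_scores, toy_trans, max_len=4)
```

On this toy input the decoder returns `[(0, 2, 0), (2, 5, 1), (5, 6, 0)]`, that is, one three-frame sign spotted between two non-sign stretches. The sketch assumes `max_len` is large enough for at least one valid segmentation to exist; a real semi-CRF would also add segment-level features, such as duration, which a frame-level CRF cannot express.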
