Towards subject independent continuous sign language recognition: A segment and merge approach

This paper presents a segment-based probabilistic approach to robustly recognize continuous sign language sentences. The recognition strategy is based on a two-layer conditional random field (CRF) model, where the lower layer processes the component channels and provides outputs to the upper layer for sign recognition. The continuously signed sentences are first segmented, and the sub-segments are labeled SIGN or ME (movement epenthesis) by a Bayesian network (BN) which fuses the outputs of independent CRF and support vector machine (SVM) classifiers. The sub-segments labeled as ME are discarded, and the remaining SIGN sub-segments are merged and recognized by the two-layer CRF classifier; for this we propose a new algorithm based on the semi-Markov CRF decoding scheme. With eight signers, we obtained a recall rate of 95.7% and a precision of 96.6% for unseen samples from seen signers, and a recall rate of 86.6% and a precision of 89.9% for unseen signers.

Highlights
- Variations in sign language are examined to develop a signer-independent system.
- A 4-channel phoneme-based approach is used.
- Each continuous sentence is segmented into sign or movement epenthesis sub-segments.
- Sign sub-segments are merged and recognized with a two-layer CRF.
- A novel decoding scheme is proposed for the semi-Markov CRF used in the two-layer CRF.
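The segment-and-merge step described in the abstract can be illustrated with a small Python sketch. It assumes the sub-segments have already been labeled SIGN or ME, drops the ME ones, and then groups the remaining sub-segments into signs with a generic semi-Markov Viterbi search. This is not the decoding algorithm proposed in the paper, whose details the abstract does not give; the function names, the `max_len` bound, the `TEMPLATES` table, and the toy scoring function are all hypothetical stand-ins for the two-layer CRF scores.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, sign label)
ScoreFn = Callable[[Tuple[str, ...], Optional[str], str], float]


def drop_movement_epenthesis(subsegments: Sequence[str],
                             labels: Sequence[str]) -> List[str]:
    """Keep only the sub-segments labeled SIGN; discard those labeled ME."""
    return [s for s, lab in zip(subsegments, labels) if lab == "SIGN"]


def semi_markov_decode(subsegments: Sequence[str], vocab: Sequence[str],
                       score: ScoreFn, max_len: int = 3) -> List[Segment]:
    """Generic semi-Markov Viterbi: merge runs of sub-segments into signs.

    best[j][y] is the best total score of any segmentation of the first j
    sub-segments whose last merged segment carries sign label y.
    """
    n = len(subsegments)
    best: List[Dict[Optional[str], float]] = [dict() for _ in range(n + 1)]
    back: List[Dict[Optional[str], Tuple[int, Optional[str]]]] = [
        dict() for _ in range(n + 1)
    ]
    best[0][None] = 0.0  # empty prefix, no previous sign
    for j in range(1, n + 1):
        for d in range(1, min(max_len, j) + 1):  # candidate segment length
            i = j - d
            chunk = tuple(subsegments[i:j])
            for prev, prev_score in best[i].items():
                for sign in vocab:
                    total = prev_score + score(chunk, prev, sign)
                    if total > best[j].get(sign, float("-inf")):
                        best[j][sign] = total
                        back[j][sign] = (i, prev)
    if n == 0 or not best[n]:
        return []
    # Backtrack from the best-scoring final label.
    label = max(best[n], key=best[n].get)
    result: List[Segment] = []
    j = n
    while j > 0:
        i, prev = back[j][label]
        result.append((i, j, label))
        j, label = i, prev
    result.reverse()
    return result


# Hypothetical toy model: each sign is a fixed sequence of sub-segment codes.
TEMPLATES = {"HELLO": ("a", "b"), "YOU": ("c",)}


def toy_score(chunk: Tuple[str, ...], prev: Optional[str], sign: str) -> float:
    """Reward an exact template match, penalize anything else.

    The previous sign `prev` is ignored here; a real model would also
    score the transition between consecutive signs.
    """
    return float(len(chunk)) if TEMPLATES[sign] == chunk else -float(len(chunk))


subsegs = ["a", "x", "b", "c"]
labels = ["SIGN", "ME", "SIGN", "SIGN"]
kept = drop_movement_epenthesis(subsegs, labels)
decoded = semi_markov_decode(kept, list(TEMPLATES), toy_score)
print(decoded)
```

In this toy run the ME sub-segment `x` is discarded, and the remaining three sub-segments are merged into two signs. The search considers every way of cutting the kept sequence into runs of at most `max_len` sub-segments, which is what distinguishes a semi-Markov decoder from a frame-by-frame Viterbi pass.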
