Unsupervised Modeling of Signs Embedded in Continuous Sentences

The common practice in sign language recognition is to first construct individual sign models, in terms of discrete state transitions, mostly represented using Hidden Markov Models, from manually isolated sign samples, and then to use them to recognize signs in continuous sentences. In this paper we (i) propose a continuous state space model, where the states are based on purely image-based features, without the use of special gloves, and (ii) present an unsupervised approach to both extract and learn models for continuous basic units of signs, which we term signemes, from continuous sentences. Given a set of sentences containing a common sign, we can automatically learn the model for the part of the sign, or signeme, that is least affected by coarticulation; coarticulation effects exist in speech recognition, but they are even stronger in sign language. The model itself is expressed in terms of traces in a space of Relational Distributions. Each point in this space represents a Relational Distribution, capturing the spatial relationships between low-level features, such as edge points. We perform speed normalization and then incrementally extract the sign common across sentences, or signeme, with a dynamic programming framework at the core to compute the warped distance between two subsentences. We test our idea on the publicly available Boston SignStream dataset by building signeme models of 18 signs. We evaluate the quality of the models by considering how well we can localize the sign in a new sentence. We also present preliminary results on the ability to generalize across signers.
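To make the representation concrete, the following is a minimal sketch of one way a Relational Distribution over low-level features could be computed: a normalized 2D histogram of pairwise offsets between edge points in a frame. The function name, the `bins` and `extent` parameters, and the assumption that coordinates are pre-scaled to [0, 1] are all illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def relational_distribution(points, bins=8, extent=1.0):
    """Normalized histogram of pairwise spatial offsets between
    low-level features (e.g. edge points) in one frame.

    points: (n, 2) array of feature coordinates, assumed scaled to [0, 1].
    Returns a (bins, bins) array that sums to 1 — one point in the
    space of Relational Distributions; a sentence traces a path
    through this space, one distribution per frame.
    """
    pts = np.asarray(points, dtype=float)
    # All ordered pairwise offsets (dx, dy), excluding self-pairs.
    diffs = pts[:, None, :] - pts[None, :, :]
    mask = ~np.eye(len(pts), dtype=bool)
    offsets = diffs[mask]
    hist, _, _ = np.histogram2d(
        offsets[:, 0], offsets[:, 1],
        bins=bins, range=[[-extent, extent], [-extent, extent]])
    return hist / hist.sum()  # normalize to a probability distribution
```

Because the histogram is built from relative offsets rather than absolute positions, it is invariant to translation of the feature set, which is one motivation for relational representations of motion.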

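The dynamic programming framework for comparing subsentences can be sketched as a standard dynamic time warping recurrence over per-frame feature vectors. This is a generic illustration under the assumption of a Euclidean frame-to-frame cost; the actual distance the paper uses between Relational Distributions may differ.

```python
import numpy as np

def dtw_distance(a, b):
    """Warped distance between two feature sequences via dynamic
    programming (dynamic time warping).

    a: (n, d) array, one feature vector per frame.
    b: (m, d) array, one feature vector per frame.
    Returns the accumulated cost of the optimal frame alignment,
    which tolerates differences in signing speed.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # per-frame cost
            cost[i, j] = d + min(cost[i - 1, j],      # skip a frame of a
                                 cost[i, j - 1],      # skip a frame of b
                                 cost[i - 1, j - 1])  # align the frames
    return cost[n, m]
```

Evaluating this distance over all subsentence pairs of two sentences is what makes it possible to pick out the common stretch (the signeme) without supervision.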