Sign Language Phoneme Transcription with Rule-based Hand Trajectory Segmentation

A common approach to extract phonemes of sign language is to use an unsupervised clustering algorithm to group the sign segments. However, simple clustering algorithms based on distance measures usually do not work well on temporal data and require complex algorithms. In this paper, we present a simple and effective approach to extract phonemes from American sign language sentences. We first apply a rule-based segmentation algorithm to segment the hand motion trajectories of signed sentences. We then extract feature descriptors based on principal component analysis to represent the segments efficiently. The segments are clustered by k-means using these high level features to derive phonemes. 25 different continuously signed sentences from a deaf signer were used to perform the analysis. After phoneme transcription, we trained Hidden Markov Models to recognize the sequence of phonemes in the sentences. Overall, our automatic approach yielded 165 segments, and 58 phonemes were obtained based on these segments. The average number of recognition errors was 18.8 (11.4%). In comparison, completely manual trajectory segmentation and phoneme transcription, involving considerable labor yielded 173 segments, 57 phonemes, and the average number of recognition errors was 33.8 (19.5%).

[1]  Sylvie Gibet,et al.  Approximation of curvature and velocity using adaptive sampling representations - Application to hand gesture analysis , 2007 .

[2]  Wen Gao,et al.  An approach based on phonemes to large vocabulary Chinese sign language recognition , 2002, Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition.

[3]  Mubarak Shah,et al.  View-Invariant Representation and Recognition of Actions , 2002, International Journal of Computer Vision.

[4]  Shaogang Gong,et al.  Auto clustering for unsupervised learning of atomic gesture components using minimum description length , 2001, Proceedings IEEE ICCV Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems.

[5]  Scott K. Liddell,et al.  American Sign Language: The Phonological Base , 2013 .

[6]  Masaru Takeuchi,et al.  A method for recognizing a sequence of sign language words represented in a Japanese sign language sentence , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[7]  Dimitris N. Metaxas,et al.  Toward Scalability in ASL Recognition: Breaking Down Signs into Phonemes , 1999, Gesture Workshop.

[8]  Lawrence R. Rabiner,et al.  A modified K-means clustering algorithm for use in isolated work recognition , 1985, IEEE Trans. Acoust. Speech Signal Process..

[9]  KwangYun Wohn,et al.  Recognition of space-time hand-gestures using hidden Markov model , 1996, VRST.

[10]  Nanning Zheng,et al.  Unsupervised Analysis of Human Gestures , 2001, IEEE Pacific Rim Conference on Multimedia.

[11]  D. B. Fry,et al.  Theoretical aspects of mechanical speech recognition , 1959 .

[12]  Jeff A. Bilmes,et al.  What HMMs Can Do , 2006, IEICE Trans. Inf. Syst..

[13]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[14]  Wen Gao,et al.  A novel approach to automatically extracting basic units from Chinese sign language , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..

[15]  Michael Brady,et al.  The Curvature Primal Sketch , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Wen Gao,et al.  An approach to automatically extracting the basic units in Chinese sign language recognition , 2000, WCC 2000 - ICSP 2000. 2000 5th International Conference on Signal Processing Proceedings. 16th World Computer Congress 2000.

[17]  Karl-Friedrich Kraiss,et al.  Towards an Automatic Sign Language Recognition System Using Subunits , 2001, Gesture Workshop.

[18]  P. Denes,et al.  The design and operation of the mechanical speech recognizer at University College London , 1959 .