Toward Scalability in ASL Recognition: Breaking Down Signs into Phonemes

In this paper we present a novel approach to continuous, whole-sentence ASL recognition that uses phonemes instead of whole signs as the basic units. Our approach is based on a sequential phonological model of ASL. According to this model, ASL signs can be broken into movements and holds, both of which are considered phonemes. This model does away with the distinction between whole signs and epenthesis movements that we made in previous work [17]. Instead, epenthesis movements are treated just like the other movements that constitute the signs. We subsequently train Hidden Markov Models (HMMs) to recognize the phonemes, instead of the whole signs and epenthesis movements that we recognized previously [17]. Because the number of phonemes is limited, HMM-based training and recognition of the ASL signal become computationally more tractable and have the potential to lead to the recognition of large-scale vocabularies. We experimented with a 22-word vocabulary and achieved similar recognition rates with the phoneme- and word-based approaches. This result is promising for scaling the task in the future.
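The scaling argument can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sign names, the hold labels (`H1`, `H2`), the movement labels (`M1`, `M2`), and the helper functions are hypothetical and are not taken from the paper's transcriptions or implementation. Transitions between signs are left out for brevity. The sketch only shows that the number of models to train follows the size of the phoneme inventory rather than the size of the vocabulary.

```python
from typing import Dict, List

# Hypothetical lexicon: each sign is written as a sequence of holds (H*)
# and movements (M*). All labels are invented for this illustration.
LEXICON: Dict[str, List[str]] = {
    "SIGN_A": ["H1", "M1", "H2"],
    "SIGN_B": ["H2", "M2", "H1"],
    "SIGN_C": ["H1", "M2", "H2"],
    "SIGN_D": ["H2", "M1", "H1"],
    "SIGN_E": ["H1", "M1", "H1"],
}


def phoneme_inventory(lexicon: Dict[str, List[str]]) -> List[str]:
    """Return the distinct phonemes used anywhere in the lexicon, sorted."""
    return sorted({phoneme for seq in lexicon.values() for phoneme in seq})


def sentence_to_phonemes(signs: List[str], lexicon: Dict[str, List[str]]) -> List[str]:
    """Expand a sequence of signs into the phoneme sequence a recognizer would model."""
    phonemes: List[str] = []
    for sign in signs:
        phonemes.extend(lexicon[sign])
    return phonemes


if __name__ == "__main__":
    inventory = phoneme_inventory(LEXICON)
    # Word-based modeling needs one model per sign;
    # phoneme-based modeling needs one model per distinct phoneme.
    print("signs:", len(LEXICON), "phonemes:", len(inventory))
    print(sentence_to_phonemes(["SIGN_A", "SIGN_B"], LEXICON))
```

In this toy lexicon, adding a sixth sign built from the same holds and movements would add a word-level model but no new phoneme-level model, which is the property the abstract relies on.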

[1] Thad Starner, et al. Visual Recognition of American Sign Language Using Hidden Markov Models, 1995.

[2] W. Stokoe, et al. Sign language structure: an outline of the visual communication systems of the American deaf. 1960, 1961, Journal of deaf studies and deaf education.

[3] Annelies Braffort. ARGo: An Architecture for Sign Language Recognition and Interpretation, 1996, Gesture Workshop.

[4] Dimitris N. Metaxas, et al. Parallel hidden Markov models for American sign language recognition, 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[5] Kirsti Grobel, et al. Isolated sign language recognition using hidden Markov models, 1996, 1997 IEEE International Conference on Systems, Man, and Cybernetics. Computational Cybernetics and Simulation.

[6] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.

[7] Scott K. Liddell, et al. American Sign Language: The Phonological Base, 2013.

[8] Alex Pentland, et al. Real-time American Sign Language recognition from video using hidden Markov models, 1995.

[9] Alex Pentland, et al. Coupled hidden Markov models for complex action recognition, 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[10] Pavel Laskov, et al. A Multi-Stage Approach to Fingerspelling and Gesture Recognition, 1996.

[11] Dimitris N. Metaxas, et al. ASL recognition based on a coupling between HMMs and 3D motion analysis, 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[12] M. B. Waldron, et al. Isolated ASL sign recognition system for deaf persons, 1995.

[13] W. Sandler. Phonological Representation of the Sign: Linearity and Nonlinearity in American Sign Language, 1989.

[14] Dimitris N. Metaxas, et al. Adapting hidden Markov models for ASL recognition by using three-dimensional computer vision methods, 1997, 1997 IEEE International Conference on Systems, Man, and Cybernetics. Computational Cybernetics and Simulation.

[15] Mohammed Waleed Kadous, et al. Machine Recognition of Auslan Signs Using PowerGloves: Towards Large-Lexicon Recognition of Sign Lan, 1996.

[16] Sylvie Gibet, et al. Corpus 3D Natural Movements and Sign Language Primitives of Movement, 1997, Gesture Workshop.

[17] KwangYun Wohn, et al. Recognition of space-time hand-gestures using hidden Markov model, 1996, VRST.

[18] Ming Ouhyoung, et al. A real-time continuous gesture recognition system for sign language, 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[19] Yh Nam Yanghee Nam. Hidden Markov Model Based Recognition of Space-Time Hand Gestures for Human-Computer Interaction, 1996.