A Framework for Recognizing the Simultaneous Aspects of American Sign Language

The major challenge facing American Sign Language (ASL) recognition today is developing methods that scale well with increasing vocabulary size. Unlike in spoken languages, phonemes in ASL can occur simultaneously, and the number of possible phoneme combinations is approximately 1.5×10^9, which conventional hidden Markov model-based methods cannot handle. Gesture recognition, which is less constrained than ASL recognition, suffers from the same problem. In this paper we present a novel framework for ASL recognition intended to address these scalability problems. It is based on breaking signs down into their constituent phonemes and modeling them with parallel hidden Markov models, which model the simultaneous aspects of ASL independently. The models can therefore be trained independently and do not require the different phoneme combinations to be considered at training time. We show in experiments with a 22-sign vocabulary how to apply this framework in practice, and we show that parallel hidden Markov models outperform conventional hidden Markov models.
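To make the idea concrete, the following is a minimal sketch of how parallel HMM scoring can work, not the authors' implementation: it assumes each sign is decomposed into independent channels (for example, hand movement and handshape phonemes), each channel has its own discrete-observation HMM trained separately, and a sign's combined score is the sum of the per-channel log-likelihoods. All parameter values and channel names here are hypothetical.

```python
# Sketch of parallel HMM (PaHMM) scoring: independent channel HMMs whose
# log-likelihoods are summed, so no joint model over phoneme combinations is needed.
import numpy as np

def log_forward(log_pi, log_A, log_B, obs):
    """Forward algorithm in log space; returns log P(obs | HMM)."""
    alpha = log_pi + log_B[:, obs[0]]                      # initial step
    for o in obs[1:]:
        # sum over previous states (logsumexp), then emit the next symbol
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def pahmm_log_likelihood(channel_models, channel_obs):
    """Combine independently trained channel HMMs by summing log-likelihoods."""
    return sum(log_forward(*m, obs) for m, obs in zip(channel_models, channel_obs))

# Toy example: two channels, each with a two-state HMM (random, hypothetical parameters).
rng = np.random.default_rng(0)
def random_hmm(n_states=2, n_symbols=4):
    pi = rng.dirichlet(np.ones(n_states))                  # initial state distribution
    A = rng.dirichlet(np.ones(n_states), size=n_states)    # state transition matrix
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)   # emission probabilities
    return np.log(pi), np.log(A), np.log(B)

sign_model = [random_hmm(), random_hmm()]                   # one HMM per channel
observation = [np.array([0, 1, 2]), np.array([3, 1])]       # per-channel phoneme sequences
print(pahmm_log_likelihood(sign_model, observation))
```

Because each channel is scored on its own, training data is only needed per channel rather than per combination of simultaneous phonemes, which is the scalability argument the abstract makes.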
