Visual Recognition of American Sign Language Using Hidden Markov Models.

Abstract: Using hidden Markov models (HMMs), an unobtrusive single-view camera system is developed that can recognize hand gestures, namely a subset of American Sign Language (ASL). Previous systems have concentrated on finger spelling or isolated word recognition, often using tethered electronic gloves for input. We achieve high recognition rates for full-sentence ASL using only visual cues. A forty-word lexicon consisting of personal pronouns, verbs, nouns, and adjectives is used to create 494 randomly constructed five-word sentences that are signed by the subject to the computer. The data are separated into a 395-sentence training set and an independent 99-sentence test set. While the subject signs, the 2D position, orientation, and eccentricity of the bounding ellipses of the hands are tracked in real time with the assistance of solidly colored gloves. Simultaneous recognition and segmentation of the resulting stream of feature vectors runs five times faster than real time on an HP 735. With a strong grammar, the system achieves an accuracy of 97%; with no grammar, it reaches an accuracy of 91% (95% correct).
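
The no-grammar result quotes two numbers, 91% accuracy and 95% correct, and the abstract does not say how they differ. A minimal sketch follows, assuming the scoring convention common in speech recognition, where "correct" penalizes only deleted and substituted words and "accuracy" also penalizes inserted words. The function name `word_scores` and the error counts in the usage lines are hypothetical, chosen only to show how the two measures can diverge; they are not figures from the paper. The only number taken from the abstract is the test-set size (99 sentences of five words, so 495 words).

```python
def word_scores(n_words, deletions, substitutions, insertions):
    """Return (percent_correct, percent_accuracy) for a recognition run.

    Assumed convention (not stated in the abstract):
      correct  = (N - D - S) / N
      accuracy = (N - D - S - I) / N
    where N is the number of reference words, and D, S, I count
    deleted, substituted, and inserted words in the recognizer output.
    """
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    if min(deletions, substitutions, insertions) < 0:
        raise ValueError("error counts must be non-negative")
    hits = n_words - deletions - substitutions
    pct_correct = 100.0 * hits / n_words
    pct_accuracy = 100.0 * (hits - insertions) / n_words
    return pct_correct, pct_accuracy


# 99 test sentences of five words each give 495 reference words.
n_test_words = 99 * 5

# Hypothetical error counts, for illustration only.
pct_correct, pct_accuracy = word_scores(
    n_test_words, deletions=10, substitutions=15, insertions=20
)
print(round(pct_correct, 1), round(pct_accuracy, 1))  # 94.9 90.9
```

Under this convention, insertions are the only thing separating the two scores. That fits the abstract: without a grammar nothing constrains how many words the recognizer emits per sentence, so spurious extra words can lower accuracy while leaving the percentage correct untouched.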
