Visual understanding of dynamic hand gestures

Abstract: Analyzing a dynamic hand gesture requires processing a spatio-temporal image sequence whose length varies with each instance of the gesture. The goal is to make the richness of human gestural communication available to a machine for better man–machine interaction. We propose a novel vision-based system for automatic interpretation of a limited set of dynamic hand gestures. The system extracts the temporal signature of the hand motion from the performed gesture, using the concept of motion energy to estimate the dominant motion in the image sequence. We model the dynamic hand gesture using a finite state machine, which subsequently analyzes the temporal signature to interpret the performed gesture automatically.
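To illustrate how a finite state machine can interpret a temporal signature of variable length, here is a minimal Python sketch. It is not the system described in the abstract: the gesture names, the phase symbols, the `quantize` threshold, and the use of a per-frame displacement `(dx, dy)` as a stand-in for the dominant motion are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical gesture models: each gesture is an ordered list of motion phases.
# These names and phase sequences are illustrative, not taken from the paper.
GESTURES: Dict[str, List[str]] = {
    "wave_right": ["rest", "right", "rest"],
    "up_down": ["rest", "up", "down", "rest"],
}


def quantize(dx: float, dy: float, threshold: float = 1.0) -> str:
    """Map a per-frame dominant displacement to a motion symbol.

    Assumes image coordinates (y grows downward) and an arbitrary
    threshold below which the hand is considered at rest.
    """
    if max(abs(dx), abs(dy)) < threshold:
        return "rest"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


def run_fsm(phases: Sequence[str], signature: Sequence[str]) -> bool:
    """Accept the signature if it passes through every phase in order.

    Each state has a self-loop, so a phase may last any number of frames.
    This is what lets the same machine handle sequences of different length.
    """
    if not signature or signature[0] != phases[0]:
        return False
    state = 0
    for symbol in signature:
        if symbol == phases[state]:
            continue  # self-loop: stay in the current phase
        if state + 1 < len(phases) and symbol == phases[state + 1]:
            state += 1  # advance to the next phase
        else:
            return False  # symbol fits neither the current nor the next phase
    return state == len(phases) - 1


def interpret(motions: Sequence[Tuple[float, float]]) -> Optional[str]:
    """Return the name of the first gesture whose FSM accepts the motions."""
    signature = [quantize(dx, dy) for dx, dy in motions]
    for name, phases in GESTURES.items():
        if run_fsm(phases, signature):
            return name
    return None
```

Because every state loops on its own symbol, a slow and a fast performance of the same gesture produce signatures of different lengths yet follow the same path through the machine.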
