Dynamic gesture recognition using PCA with multiscale theory and HMM

This paper presents a dynamic gesture recognition system that requires no special hardware other than a webcam. The system is based on a novel method that combines Principal Component Analysis (PCA) with hierarchical multiscale theory and Discrete Hidden Markov Models (DHMM). We use a hierarchical decision tree based on multiscale theory. First, we convolve every image in the training data with a Gaussian kernel, which blurs the differences between images and reduces their separation in feature space; this in turn reduces the number of eigenvectors needed to describe the data. A principal component space is computed from the convolved data, and the data in this space are divided into two clusters using the k-means algorithm. The level of blurring is then reduced and PCA is applied to each cluster separately, so that a new principal component space is formed from each cluster. Each of these spaces is again divided into two, and the process is repeated. The result is a binary tree of principal component spaces in which each level represents a different degree of blurring. Search time is proportional to the depth of the tree, which makes it possible to search hundreds of gestures in real time. The output of the decision tree is then passed to the DHMM to recognize temporal information.
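The tree construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`gaussian_blur`, `fit_pca`, `two_means`, `build_tree`, `descend`), the blur schedule `sigmas`, the number of retained components, and the deterministic seeding of k-means are all assumptions made for illustration, and the DHMM stage is omitted. Each node stores its blur level, a PCA basis, and two cluster centres; a query image is blurred at each level, projected, and routed to the nearer centre.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur of a 2-D image; sigma <= 0 means no blurring."""
    img = np.asarray(img, dtype=float)
    if sigma <= 0:
        return img
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    # Edge-pad so the "valid" convolution returns a vector of the original length.
    blur_1d = lambda v: np.convolve(np.pad(v, radius, mode="edge"), kernel, mode="valid")
    rows = np.apply_along_axis(blur_1d, 1, img)
    return np.apply_along_axis(blur_1d, 0, rows)

def fit_pca(X, n_components):
    """Return the data mean and the leading principal directions (as rows)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def two_means(Z, n_iter=20):
    """Lloyd's k-means with k=2, seeded with two far-apart points (an assumption)."""
    c0 = Z[np.argmax(np.linalg.norm(Z - Z.mean(axis=0), axis=1))]
    c1 = Z[np.argmax(np.linalg.norm(Z - c0, axis=1))]
    centers = np.stack([c0, c1]).astype(float)
    for _ in range(n_iter):
        labels = np.linalg.norm(Z[:, None, :] - centers[None], axis=2).argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = Z[labels == k].mean(axis=0)
    # Final assignment uses the final centres, matching what descend() does.
    labels = np.linalg.norm(Z[:, None, :] - centers[None], axis=2).argmin(axis=1)
    return labels, centers

def build_tree(images, ids, sigmas, n_components=2):
    """Recursively build a binary tree of PCA spaces, one blur level per depth."""
    sigma = sigmas[0]
    X = np.stack([gaussian_blur(im, sigma).ravel() for im in images])
    mean, basis = fit_pca(X, n_components)
    node = {"sigma": sigma, "mean": mean, "basis": basis, "ids": list(ids)}
    if len(sigmas) > 1 and len(images) > 1:
        labels, centers = two_means((X - mean) @ basis.T)
        if 0 < labels.sum() < len(labels):  # split only if both clusters are non-empty
            node["centers"] = centers
            node["children"] = [
                build_tree([im for im, l in zip(images, labels) if l == k],
                           [i for i, l in zip(ids, labels) if l == k],
                           sigmas[1:], n_components)
                for k in range(2)
            ]
    return node

def descend(node, image):
    """Route a query image from the root to a leaf; cost grows with tree depth."""
    while "children" in node:
        z = (gaussian_blur(image, node["sigma"]).ravel() - node["mean"]) @ node["basis"].T
        k = int(np.linalg.norm(node["centers"] - z, axis=1).argmin())
        node = node["children"][k]
    return node["ids"]

# Toy usage with random 8x8 "frames" standing in for gesture images.
rng = np.random.default_rng(0)
frames = [rng.random((8, 8)) for _ in range(6)]
tree = build_tree(frames, range(len(frames)), sigmas=[2.0, 1.0, 0.0])
leaf_ids = descend(tree, frames[3])
```

In this sketch the list of training ids stored at the reached leaf plays the role of the decision tree's output; in the system described above, that per-frame output would be the observation sequence handed to the DHMM.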
