Gesture Recognition from One Example Using Depth Images

This paper presents a depth-image approach that recognizes gestures from only one training example per class. Background removal and denoising are first performed on the depth images. Motion Energy Information (MEI) images are then obtained by computing the differences between consecutive frames. From each MEI image we extract a Histograms of Oriented Gradients (HOG) descriptor, so that the successive movements of a gesture are represented as a time series of descriptors. Principal Component Analysis (PCA) is applied to the descriptors of each training gesture to find a set of discriminative principal components (PCs) for that gesture. The descriptors extracted from a test gesture are then projected onto, and reconstructed from, each set of training PCs. Finally, the test gesture is assigned to the class whose PCs yield the lowest reconstruction error. We evaluate the approach on one-example gesture recognition from depth images and compare it with other methods, obtaining promising results.
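The frame-differencing and PCA-reconstruction steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`motion_energy`, `fit_pcs`, `reconstruction_error`, `classify`), the mean-squared error measure, and the number of retained components are assumptions made for illustration. The descriptors are assumed to be precomputed per-frame feature vectors (for example HOG); background removal, denoising, and HOG extraction are omitted.

```python
import numpy as np


def motion_energy(frames):
    """Absolute difference between consecutive depth frames.

    frames: array of shape (T, H, W). Returns an array of shape (T-1, H, W).
    """
    frames = np.asarray(frames, dtype=np.float64)
    return np.abs(np.diff(frames, axis=0))


def fit_pcs(descriptors, n_components):
    """Fit a PCA basis to one training gesture.

    descriptors: array of shape (n_frames, n_features), one row per frame.
    Returns (mean, components), where components has shape
    (n_components, n_features) with orthonormal rows.
    """
    X = np.asarray(descriptors, dtype=np.float64)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(descriptors, mean, components):
    """Mean squared error after projecting onto the PCs and back."""
    X = np.asarray(descriptors, dtype=np.float64) - mean
    reconstructed = X @ components.T @ components
    return float(np.mean(np.sum((X - reconstructed) ** 2, axis=1)))


def classify(test_descriptors, models):
    """Return the label whose PCs reconstruct the test gesture best.

    models: dict mapping label -> (mean, components) from fit_pcs.
    """
    errors = {
        label: reconstruction_error(test_descriptors, mean, comps)
        for label, (mean, comps) in models.items()
    }
    return min(errors, key=errors.get)
```

In use, one model would be fitted per training gesture with `fit_pcs`, and each test gesture's descriptor sequence passed to `classify`. How many components to keep is left open here; the abstract does not state the value used.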
