Sparse representations for hand gesture recognition

Dynamic recognition of gestures from video sequences is a challenging task because the characteristics of each gesture vary widely across individuals. In this work, we propose a novel representation of gestures as linear combinations of the elements of an overcomplete dictionary, based on the emerging theory of sparse representations. We evaluate our approach on the publicly available Palm Graffiti Digits gesture dataset and compare it with other state-of-the-art methods: Hidden Markov Models, Dynamic Time Warping, and the recently proposed Move-Split-Merge distance metric. Our experimental results suggest that the proposed recognition scheme achieves high accuracy in isolated gesture recognition and satisfactory robustness to noisy data, indicating that sparse representations can be successfully applied to gesture recognition.
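
To make the idea of representing a gesture as a sparse linear combination of dictionary elements concrete, the following is a minimal sketch of sparse-representation-based classification. It is not the pipeline evaluated in this work; it rests on several assumptions: each gesture has already been converted to a fixed-length feature vector, the dictionary columns are labeled training examples, the sparse code is computed with a plain orthogonal matching pursuit, and the predicted class is the one whose atoms give the smallest reconstruction residual. The names `omp`, `classify`, and `toy_dictionary` are hypothetical, and the data are synthetic.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: approximate y using at most
    n_nonzero columns (atoms) of the dictionary D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


def classify(D, labels, y, n_nonzero=5):
    """Assign y to the class whose atoms reconstruct it with the
    smallest residual, using only that class's sparse coefficients."""
    x = omp(D, y, n_nonzero)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]


def toy_dictionary(rng, dim=20, atoms_per_class=5):
    """Synthetic two-class dictionary (assumption for illustration):
    class 0 atoms occupy the first half of the feature vector,
    class 1 atoms occupy the second half."""
    half = dim // 2
    D = np.zeros((dim, 2 * atoms_per_class))
    D[:half, :atoms_per_class] = rng.standard_normal((half, atoms_per_class))
    D[half:, atoms_per_class:] = rng.standard_normal((dim - half, atoms_per_class))
    D /= np.linalg.norm(D, axis=0)  # unit-norm atoms
    labels = np.repeat([0, 1], atoms_per_class)
    return D, labels


# Usage: a query built from two class-1 atoms.
D, labels = toy_dictionary(np.random.default_rng(0))
query = 0.7 * D[:, 6] + 0.3 * D[:, 8]
print(classify(D, labels, query))
```

In a real setting the dictionary would hold training gestures rather than random vectors, and the sparsity level `n_nonzero` would need tuning; an l1-based solver such as basis pursuit could replace the greedy step.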
