Dynamic hand gesture recognition: An exemplar-based approach from motion divergence fields

Exemplar-based approaches for dynamic hand gesture recognition usually require a large collection of gestures to achieve high-quality performance. An efficient visual representation of the motion patterns is therefore essential for a scalable gesture recognition solution when the database is large. In this paper, we propose a new visual representation for hand motions based on motion divergence fields, which can be normalized to gray-scale images. Salient regions such as Maximally Stable Extremal Regions (MSER) are then detected on the motion divergence maps. From each detected region, a local descriptor is extracted to capture local motion patterns. We further adapt indexing techniques from image search to gesture recognition. The extracted descriptors are indexed using a pre-trained vocabulary. A new gesture sample can then be matched efficiently against the database gestures through a term frequency-inverse document frequency (TF-IDF) weighting scheme. We have collected a hand gesture database with 10 categories and 1050 video samples for performance evaluation and further applications. The proposed method achieves higher recognition accuracy than other state-of-the-art motion and spatio-temporal features on this database. In addition, the average recognition time of our method for each gesture sequence is only 34.53 ms.
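
The two numerical steps named in the abstract, turning a motion field into a gray-scale divergence map and matching visual-word histograms with TF-IDF weights, can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes a dense flow field stored as an (H, W, 2) array of x and y components, min-max normalization to 8-bit gray levels, and cosine similarity between TF-IDF vectors; the abstract specifies none of these. The function names are hypothetical, and MSER detection, descriptor extraction, and vocabulary training are omitted.

```python
import numpy as np


def divergence_map(flow):
    """Divergence du/dx + dv/dy of a dense motion field.

    Assumed layout: flow[..., 0] is the x-component u and
    flow[..., 1] is the y-component v, both of shape (H, W).
    """
    du_dx = np.gradient(flow[..., 0], axis=1)  # x varies along columns
    dv_dy = np.gradient(flow[..., 1], axis=0)  # y varies along rows
    return du_dx + dv_dy


def to_gray(div):
    """Min-max normalize a divergence map to an 8-bit gray-scale image."""
    lo, hi = div.min(), div.max()
    if hi == lo:
        # A constant field carries no contrast; return a black image.
        return np.zeros(div.shape, dtype=np.uint8)
    return np.round(255.0 * (div - lo) / (hi - lo)).astype(np.uint8)


def match_gesture(query_hist, db_hists):
    """Rank database gestures against a query by TF-IDF cosine similarity.

    query_hist: (K,) visual-word counts of the new gesture.
    db_hists:   (N, K) visual-word counts of the database gestures.
    Returns the index of the best match and all N similarity scores.
    """
    db = np.asarray(db_hists, dtype=float)
    # Inverse document frequency, estimated from the database only.
    doc_freq = np.count_nonzero(db, axis=0)
    idf = np.log(db.shape[0] / np.maximum(doc_freq, 1))

    def embed(hists):
        h = np.atleast_2d(np.asarray(hists, dtype=float))
        tf = h / np.maximum(h.sum(axis=1, keepdims=True), 1.0)
        vec = tf * idf
        norm = np.linalg.norm(vec, axis=1, keepdims=True)
        return vec / np.maximum(norm, 1e-12)

    scores = embed(db) @ embed(query_hist).ravel()
    return int(np.argmax(scores)), scores
```

In such a sketch, words that occur in every database gesture receive zero weight, so the ranking is driven by the rarer, more discriminative motion patterns. An inverted index over the vocabulary would replace the dense matrix product when the database is large.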
