A robust gesture recognition using hand local data and skeleton trajectory

In this paper, we propose a new approach for dynamic hand gesture recognition using intensity, depth, and skeleton joint data captured by a Kinect™ sensor. The proposed approach integrates global and local information about a dynamic gesture. First, we represent the 3D skeleton trajectory in spherical coordinates. Then, we extract the key frames corresponding to the points with the greatest angular and distance differences. In each key frame, we compute the spherical distances from the hands, wrists, and elbows to the shoulder center, and we record the changes in hand position to obtain the global information. Finally, we segment the hands and apply the SIFT descriptor to the intensity and depth data; a Bag of Visual Words (BoW) approach is then used to extract the local information. The system was tested on the ChaLearn 2013 gesture dataset and on our own Brazilian Sign Language dataset, achieving accuracies of 88.39% and 98.28%, respectively.
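As a minimal sketch of the trajectory representation described above, the snippet below converts a skeleton joint's Cartesian position into spherical coordinates relative to the shoulder center. The function name and the joint/origin tuple layout are illustrative assumptions, not the paper's actual implementation.

```python
import math

def to_spherical(joint, shoulder_center):
    """Express a 3D joint position in spherical coordinates (r, theta, phi)
    relative to the shoulder center (hypothetical helper, not from the paper)."""
    dx = joint[0] - shoulder_center[0]
    dy = joint[1] - shoulder_center[1]
    dz = joint[2] - shoulder_center[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)      # radial distance to shoulder center
    theta = math.acos(dz / r) if r > 0 else 0.0     # polar angle from the z-axis
    phi = math.atan2(dy, dx)                        # azimuth in the x-y plane
    return r, theta, phi
```

Representing the trajectory this way separates how far a hand is from the body (r) from the direction of motion (theta, phi), which is what makes frame-to-frame angular and distance differences easy to compare when selecting key frames.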
