A New Approach for Dynamic Gesture Recognition Using Skeleton Trajectory Representation and Histograms of Cumulative Magnitudes

In this paper, we present a new approach for dynamic hand gesture recognition that uses intensity, depth, and skeleton joint data captured by a Kinect sensor. The method integrates global and local information about a dynamic gesture. First, we represent the 3D skeleton trajectory in spherical coordinates. Then, we select the most relevant points in the hand trajectory with our proposed keyframe detection method. Afterwards, we represent the joint movements using spatial, temporal, and hand position change information. Next, we use the definition of direction cosines to describe body positions by generating histograms of cumulative magnitudes from the depth data, which are first converted into a point cloud. We evaluate our approach on several public gesture datasets and on a sign language dataset that we created. Our results outperform state-of-the-art methods, and feature extraction is fast and smooth enough to be implemented in real time.
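As a rough illustration of two steps named above, the following Python sketch converts a joint trajectory to spherical coordinates and builds a simple histogram of cumulative magnitudes from direction cosines. It is a minimal sketch under stated assumptions, not the method as implemented in the paper: the function names, the reference origin, the angle convention (polar angle measured from the z axis), the number of bins, and the binning rule are all hypothetical choices, since the abstract does not specify them.

```python
import math

def to_spherical(x, y, z):
    """Convert Cartesian (x, y, z) to (r, theta, phi).

    Assumed convention: theta is the polar angle from the +z axis,
    phi is the azimuth in the x-y plane.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return 0.0, 0.0, 0.0
    theta = math.acos(z / r)
    phi = math.atan2(y, x)
    return r, theta, phi

def trajectory_to_spherical(points, origin):
    """Express each 3D point relative to a reference joint (assumed origin)."""
    ox, oy, oz = origin
    return [to_spherical(x - ox, y - oy, z - oz) for x, y, z in points]

def cumulative_magnitude_histogram(vectors, bins=8):
    """Build one histogram per axis (x, y, z).

    For every vector, each direction cosine is turned into an angle in
    [0, pi]; the vector's magnitude is added to the bin that angle falls in.
    The bin count and this binning rule are illustrative assumptions.
    """
    hist = [[0.0] * bins for _ in range(3)]
    for v in vectors:
        norm = math.sqrt(sum(c * c for c in v))
        if norm == 0.0:
            continue  # a zero vector has no direction
        for axis, component in enumerate(v):
            cosine = max(-1.0, min(1.0, component / norm))
            angle = math.acos(cosine)
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[axis][index] += norm
    return hist
```

For example, `trajectory_to_spherical(hand_points, shoulder)` would describe a hand trajectory relative to a shoulder joint, and `cumulative_magnitude_histogram(cloud_vectors)` would return three histograms whose entries each sum to the total magnitude of the input vectors; `hand_points`, `shoulder`, and `cloud_vectors` are placeholder names for data the reader supplies.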
