MFA-Net: Motion Feature Augmented Network for Dynamic Hand Gesture Recognition from Skeletal Data

Dynamic hand gesture recognition has attracted increasing attention because of its importance for human–computer interaction. In this paper, we propose a novel motion feature augmented network (MFA-Net) for dynamic hand gesture recognition from skeletal data. MFA-Net uses motion features of finger and global hand movements to augment the features learned by a deep network for gesture recognition. To describe articulated finger movements, finger motion features are extracted from the hand skeleton sequence via a variational autoencoder. Global motion features represent the global movement of the hand skeleton. These motion features, along with the skeleton sequence, are then fed into three branches of a recurrent neural network (RNN), which augments the motion information available to the RNN and improves classification performance. The proposed MFA-Net is evaluated on two challenging skeleton-based dynamic hand gesture datasets: the DHG-14/28 dataset and the SHREC’17 dataset. Experimental results demonstrate that the proposed method achieves comparable performance on the DHG-14/28 dataset and better performance on the SHREC’17 dataset when compared with state-of-the-art methods.
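
The three-branch arrangement described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the plain tanh recurrence, the fusion by concatenating the final hidden states, the random untrained weights, and all names and dimensions (`RecurrentBranch`, `ThreeBranchClassifier`, `hidden_dim`, the feature sizes) are assumptions chosen only to show how a skeleton sequence and two motion-feature sequences could each pass through their own recurrent branch before a shared classifier.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained; for shape illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class RecurrentBranch:
    """Plain tanh RNN that returns its final hidden state (assumed cell type)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        self.w_in = make_matrix(hidden_dim, input_dim, rng)
        self.w_rec = make_matrix(hidden_dim, hidden_dim, rng)

    def encode(self, sequence):
        hidden = [0.0] * self.hidden_dim
        for frame in sequence:
            from_input = matvec(self.w_in, frame)
            from_state = matvec(self.w_rec, hidden)
            hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        return hidden


class ThreeBranchClassifier:
    """One branch each for skeleton, finger-motion, and global-motion inputs."""

    def __init__(self, skeleton_dim, finger_dim, global_dim,
                 hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.branches = [
            RecurrentBranch(dim, hidden_dim, rng)
            for dim in (skeleton_dim, finger_dim, global_dim)
        ]
        # Assumed fusion: concatenate the three final hidden states.
        self.w_out = make_matrix(num_classes, 3 * hidden_dim, rng)

    def predict_proba(self, skeleton_seq, finger_seq, global_seq):
        fused = []
        inputs = (skeleton_seq, finger_seq, global_seq)
        for branch, sequence in zip(self.branches, inputs):
            fused.extend(branch.encode(sequence))
        return softmax(matvec(self.w_out, fused))


# Hypothetical toy input: 5 frames, with placeholder feature sizes 6, 4, and 3.
rng = random.Random(1)
skeleton = [[rng.random() for _ in range(6)] for _ in range(5)]
finger = [[rng.random() for _ in range(4)] for _ in range(5)]
global_motion = [[rng.random() for _ in range(3)] for _ in range(5)]

model = ThreeBranchClassifier(6, 4, 3, hidden_dim=8, num_classes=14)
probabilities = model.predict_proba(skeleton, finger, global_motion)
```

Here `probabilities` holds one score per gesture class and sums to 1. Because the weights are random, the values carry no meaning; in a real system each branch would be trained jointly with the classifier, and the finger-motion input would come from a separately trained variational autoencoder as the abstract describes.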
