Angle based hand gesture recognition using graph convolutional network

Hand gesture recognition has attracted considerable interest in autonomous driving, human-computer interaction, gaming, and many other areas. Skeleton-based techniques combined with graph convolutional networks (GCNs) are widely used in this field because joint coordinates are easy to estimate and graphs represent the hand structure well. However, simple hand skeleton graphs cannot capture the finer details and complex spatial features of hand gestures. To address this limitation, this work proposes an “angle‐based hand gesture graph convolutional network” (AHG‐GCN). The model adds two novel types of edges to the graph, connecting the wrist to each fingertip and to each finger's base. These edges explicitly capture relationships that play an important role in differentiating gestures. In addition, novel features are designed for each skeleton joint from the angles formed with the fingertip and finger‐base joints and from the distances among them, in order to extract semantic correlation and reduce overfitting. Together, these techniques yield an enhanced set of 25 features per joint. The proposed model achieves 90% and 88% accuracy for the 14 and 28 gesture configurations of the DHG 14/28 dataset, and 94.05% and 89.4% accuracy for the 14 and 28 gesture configurations of the SHREC 2017 dataset, respectively.
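The graph augmentation and the angle and distance features described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common 22-joint hand layout (wrist, palm, then four joints per finger from base to tip), and the joint indices, the function names `build_adjacency`, `angle_between`, and `joint_features`, and the particular choice of 10 features per joint are all hypothetical. The abstract does not specify how the 25 features are composed, so the sketch does not try to reproduce them.

```python
import numpy as np

# Assumed 22-joint layout: 0 = wrist, 1 = palm, then 4 joints per finger
# (base ... tip) for thumb, index, middle, ring, little. Hypothetical indices.
NUM_JOINTS = 22
WRIST, PALM = 0, 1
FINGER_BASES = [2, 6, 10, 14, 18]
FINGERTIPS = [5, 9, 13, 17, 21]


def build_adjacency():
    """Adjacency with self-loops, skeleton edges, and extra wrist edges."""
    edges = [(WRIST, PALM)]
    for base in FINGER_BASES:
        edges.append((PALM, base))                       # palm to finger base
        edges += [(base + k, base + k + 1) for k in range(3)]  # along finger
    # Additional edges: wrist to every finger base and every fingertip.
    extra = [(WRIST, j) for j in FINGER_BASES + FINGERTIPS]
    adj = np.eye(NUM_JOINTS)
    for i, j in edges + extra:
        adj[i, j] = adj[j, i] = 1.0
    return adj


def angle_between(u, v, eps=1e-8):
    """Angle in radians between two 3D vectors, safe for zero vectors."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


def joint_features(frame):
    """Illustrative per-joint features for one frame of shape (22, 3).

    For each joint: the angle at the wrist between that joint and each
    fingertip (5 values), then the distance to each fingertip (5 values).
    """
    frame = np.asarray(frame, dtype=float)
    tips = frame[FINGERTIPS]
    feats = np.zeros((NUM_JOINTS, 2 * len(FINGERTIPS)))
    for j in range(NUM_JOINTS):
        for k, tip in enumerate(tips):
            feats[j, k] = angle_between(frame[j] - frame[WRIST],
                                        tip - frame[WRIST])
            feats[j, len(FINGERTIPS) + k] = np.linalg.norm(frame[j] - tip)
    return feats
```

In a GCN pipeline, an adjacency matrix like this would typically be normalized before use, and the per-joint features would be concatenated with the raw coordinates for every frame. Both steps are omitted here.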
