A Real-time Multimodal Hand Gesture Recognition via 3D Convolutional Neural Network and Key Frame Extraction

In this paper, we present an approach to hand gesture recognition that combines a 3D Convolutional Neural Network (3D_CNN) with a key frame extraction algorithm implemented as a fast neural network. Typically, 3D_CNN algorithms classify hand gestures from a number of randomly sampled image sequences. In this work, key frames extracted by a clustering-based static video summarization method are used as the input to the 3D_CNN to improve classification accuracy. Because video summarization is computationally expensive, we propose a fast SegNet-based deep neural network that learns to identify the key frames selected by the video summarization method VSUMM, reducing computation time to a level suitable for a real-time system. We evaluate the proposed algorithms on the public Cambridge gestures dataset and the Seven Hand Gestures (SHG) dataset. We also experimentally estimate the number of key frames per video sequence in those datasets. The algorithm achieves 94.4% classification accuracy on the Cambridge gestures dataset and 77.71% on the SHG dataset. The experimental results show that the proposed approach is efficient and outperforms related state-of-the-art work while running in real time.
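To make the clustering-based key frame idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes per-channel RGB histograms as frame features (VSUMM itself works with color histograms in a different color space), a plain k-means with evenly spaced initial centroids, and hypothetical names `color_histogram` and `extract_key_frames`. Each cluster contributes the frame closest to its centroid as a key frame.

```python
import numpy as np


def color_histogram(frame, bins=16):
    """Concatenated, normalized per-channel histogram of one H x W x C frame."""
    hist = [
        np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
        for c in range(frame.shape[-1])
    ]
    hist = np.concatenate(hist).astype(np.float64)
    return hist / hist.sum()


def extract_key_frames(frames, k, bins=16, n_iter=20):
    """Return sorted indices of up to k key frames (assumed simplification)."""
    feats = np.stack([color_histogram(f, bins) for f in frames])

    # Deterministic initialization: centroids at evenly spaced frames.
    init = np.linspace(0, len(frames) - 1, k).astype(int)
    centroids = feats[init].copy()

    for _ in range(n_iter):
        # Distance of every frame to every centroid, shape (T, k).
        dists = np.linalg.norm(feats[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = feats[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)

    # Final assignment, then pick the frame nearest to each centroid.
    dists = np.linalg.norm(feats[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    keys = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if len(idx):
            keys.append(int(idx[dists[idx, j].argmin()]))
    return sorted(keys)


# Synthetic 30-frame video: three 10-frame segments of solid red, green, blue.
colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
video = np.concatenate(
    [np.full((10, 8, 8, 3), c, dtype=np.uint8) for c in colors]
)
keys = extract_key_frames(video, k=3)
clip = video[keys]  # key frames stacked as a short clip for a 3D CNN
```

In this sketch `clip` has shape `(3, 8, 8, 3)`, one frame per visually distinct segment. The number of clusters `k` plays the role of the number of key frames per video sequence, which the abstract says is estimated experimentally.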
