Gesture Recognition and Localization Using Convolutional Neural Network

Vision-based gesture recognition is natural and intuitive, and it is attracting growing attention in the field of human-computer interaction. To address the limitations of existing gesture recognition methods, this paper presents an efficient and effective solution with two main parts: dataset collection and convolutional neural network (CNN) design. First, a novel sliding-window method is proposed for collecting the hand gesture dataset; it makes capturing arbitrary gesture data against any background simple and fast. Second, a novel CNN named HandNet is designed to improve the performance of gesture recognition and localization. The backbone of the proposed CNN is HandBlock, which provides richer feature expression and adaptive weighting between high-level and low-level features. Experimental results show that the proposed model achieves performance comparable to state-of-the-art methods.
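The abstract does not specify how the sliding window is applied during dataset collection. As a rough illustration only, the sketch below shows one common way to enumerate fixed-size windows over a frame so that each crop could later be labeled and stored. The function name `sliding_windows`, the window size, and the stride are assumptions made for this example and are not taken from the paper.

```python
def sliding_windows(frame, window=64, stride=32):
    """Yield (x, y, patch) for every window that fits fully inside the frame.

    `frame` is a list of rows (each row a list of pixel values).
    `window` and `stride` are illustrative values, not the paper's settings.
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            # Crop the window: rows y..y+window, columns x..x+window.
            patch = [row[x:x + window] for row in frame[y:y + window]]
            yield x, y, patch


# Hypothetical 128x192 grayscale frame filled with zeros.
frame = [[0] * 192 for _ in range(128)]
patches = list(sliding_windows(frame))
print(len(patches))  # number of candidate crops for this frame
```

With these assumed settings, each window position gives one candidate crop together with its top-left coordinate, which is the kind of information a localization label would need.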
