An Improved Hand Gesture Recognition with Two-Stage Convolution Neural Networks Using a Hand Color Image and its Pseudo-Depth Image

Robust hand gesture recognition has long played a significant role in human-computer interaction, but it remains challenging due to factors such as cluttered backgrounds and hand self-occlusion. Depth-based methods perform better because they exploit depth information, but depth cameras are neither as widely used nor as affordable as color cameras. In this paper, we therefore propose a two-stage deep convolutional neural network (CNN) architecture for accurate color-based hand gesture recognition. The first stage generates a pseudo-depth hand image from a color image, and the second stage recognizes the hand gesture class using both the color image and its pseudo-depth image. The generation stage is based on an image-to-image translation network. In the recognition stage, we propose a two-stream CNN architecture that takes the color image and its pseudo-depth image as inputs to improve on recognition from the color image alone. We also propose two strategies for fusing the two streams: feature fusion and committee fusion. To validate our approach, we construct a new dataset called the MaHG-RGBD dataset. Experiments demonstrate that our approach significantly improves the performance of RGB-only hand gesture recognition.
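The two fusion strategies named above can be illustrated with a minimal NumPy sketch. The abstract does not specify how either strategy is implemented, so everything here is an assumption made for illustration: feature fusion is taken to mean concatenating the per-stream feature vectors before a shared linear classifier, and committee fusion is taken to mean averaging the class probabilities predicted by each stream. The function names, dimensions, and random stand-in features are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def feature_fusion(color_feat, depth_feat, weights, bias):
    """Assumed feature fusion: concatenate stream features, then classify once."""
    fused = np.concatenate([color_feat, depth_feat], axis=-1)
    return softmax(fused @ weights + bias)


def committee_fusion(color_probs, depth_probs):
    """Assumed committee fusion: average the per-stream class probabilities."""
    return (color_probs + depth_probs) / 2.0


# Hypothetical sizes; random arrays stand in for CNN stream outputs.
rng = np.random.default_rng(0)
batch, feat_dim, num_classes = 4, 8, 5
color_feat = rng.normal(size=(batch, feat_dim))
depth_feat = rng.normal(size=(batch, feat_dim))

# Feature fusion: one classifier over the concatenated features.
w_fused = rng.normal(size=(2 * feat_dim, num_classes))
b_fused = np.zeros(num_classes)
probs_feature = feature_fusion(color_feat, depth_feat, w_fused, b_fused)

# Committee fusion: one classifier per stream, then average the outputs.
w_color = rng.normal(size=(feat_dim, num_classes))
w_depth = rng.normal(size=(feat_dim, num_classes))
probs_committee = committee_fusion(
    softmax(color_feat @ w_color), softmax(depth_feat @ w_depth)
)

pred_feature = probs_feature.argmax(axis=-1)
pred_committee = probs_committee.argmax(axis=-1)
```

Under these assumptions, feature fusion lets a single classifier learn interactions between color and pseudo-depth features, while committee fusion keeps the two streams independent and combines only their final predictions.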
