Dynamic Hand Gesture Recognition Based on 3D Convolutional Neural Network Models

Hand gestures are a natural means of communication and could provide a more convenient interface for human-robot interaction. In this study, we use an ordinary laptop camera as the input sensor. We designed a dynamic hand gesture recognition model based on a 3D convolutional neural network and trained it on the Jester dataset. After about one day of training on a MacBook Pro (i5, 2.3 GHz), the model reached an average accuracy of 90%. We also built a web application that implements the hand gesture recognition system and offers recognition as a service to users.
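To illustrate why a 3D convolution suits dynamic gestures, the following minimal sketch applies a single 3D kernel across the time, height, and width axes of a short clip. It is not the model described here: the function `conv3d_valid`, the helper `make_clip`, the toy clip size, and the temporal-difference kernel are all hypothetical choices made for illustration only.

```python
# Illustrative sketch only: a naive single-channel 3D convolution
# (cross-correlation form, stride 1, no padding). All names and sizes
# are assumptions for this example, not taken from the paper.

def conv3d_valid(clip, kernel):
    """Slide a (kt, kh, kw) kernel over a (T, H, W) clip."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the kernel's temporal and spatial extent.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def make_clip(frames=4, size=5, moving=True):
    """Toy clip: one bright pixel on row 2 that shifts right each frame."""
    clip = [[[0.0] * size for _ in range(size)] for _ in range(frames)]
    for t in range(frames):
        clip[t][2][t if moving else 0] = 1.0
    return clip


# A 2x1x1 kernel that subtracts each frame from the next one, so it
# responds only where pixel values change over time.
temporal_diff = [[[-1.0]], [[1.0]]]

moving_response = conv3d_valid(make_clip(moving=True), temporal_diff)
static_response = conv3d_valid(make_clip(moving=False), temporal_diff)

print(len(moving_response), len(moving_response[0]), len(moving_response[0][0]))
print(moving_response[0][2])
print(static_response[0][2])
```

Because the kernel spans two consecutive frames, the moving pixel produces a nonzero response while the static clip produces none. A 2D convolution applied frame by frame cannot capture this kind of motion cue in a single layer.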
