Hand Gesture Recognition in Video Sequences Using Deep Convolutional and Recurrent Neural Networks

Abstract. Deep learning is a relatively new branch of machine learning that is widely used in artificial intelligence applications, including signal processing and computer vision. This research investigates the use of deep learning to solve the hand gesture recognition (HGR) problem and proposes two models based on deep learning architectures. The first model comprises a convolutional neural network (CNN) and a recurrent neural network with long short-term memory (RNN-LSTM). It reaches an accuracy of up to 82% when fed the colour channel and up to 89% when fed the depth channel. The second model comprises two parallel CNNs, which are combined by a merge layer, and an RNN-LSTM; it is fed RGB-D data and reaches an accuracy of up to 93%.
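
The data flow of the second model (two parallel per-frame feature extractors, a merge step, then an LSTM over the frame sequence) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the average-pooling `frame_features` function is only a stand-in for a trained CNN branch, the merge is assumed to be concatenation, the colour frames are simplified to a single channel, the weights are random and untrained, and every name and size below is hypothetical.

```python
import math
import random


def frame_features(frame, grid=2):
    """Stand-in for one CNN branch: average-pool a 2D frame into grid*grid values."""
    h, w = len(frame), len(frame[0])
    bh, bw = h // grid, w // grid
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            block = [frame[y][x]
                     for y in range(gy * bh, (gy + 1) * bh)
                     for x in range(gx * bw, (gx + 1) * bw)]
            feats.append(sum(block) / len(block))
    return feats


def merge(colour_feats, depth_feats):
    """Merge layer, assumed here to be plain concatenation of the two branches."""
    return colour_feats + depth_feats


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


class LSTMCell:
    """A single LSTM cell with random (untrained) weights."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate: input (i), forget (f), candidate (g), output (o).
        self.w = {gate: [[rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hidden)]
                         for _ in range(n_hidden)]
                  for gate in "ifgo"}
        self.b = {gate: [0.0] * n_hidden for gate in "ifgo"}

    def _gate(self, name, z, activation):
        return [activation(sum(wi * zi for wi, zi in zip(row, z)) + bias)
                for row, bias in zip(self.w[name], self.b[name])]

    def step(self, x, h, c):
        z = x + h  # concatenate the current input with the previous hidden state
        i = self._gate("i", z, sigmoid)
        f = self._gate("f", z, sigmoid)
        o = self._gate("o", z, sigmoid)
        cand = self._gate("g", z, math.tanh)
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c


def classify(colour_seq, depth_seq, cell, w_out):
    """Run both branches on each frame pair, merge, feed the LSTM, classify the last state."""
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    for colour, depth in zip(colour_seq, depth_seq):
        x = merge(frame_features(colour), frame_features(depth))
        h, c = cell.step(x, h, c)
    logits = [sum(wi * hi for wi, hi in zip(row, h)) for row in w_out]
    return softmax(logits)


# Toy input: 5 frames of 4x4 values per stream, 3 hypothetical gesture classes.
rng = random.Random(0)
T, H, W, N_CLASSES, N_HIDDEN = 5, 4, 4, 3, 6
colour_seq = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(T)]
depth_seq = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(T)]
cell = LSTMCell(n_in=8, n_hidden=N_HIDDEN, rng=rng)
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)] for _ in range(N_CLASSES)]
probs = classify(colour_seq, depth_seq, cell, w_out)
print(probs)  # one probability per class
```

Dropping the depth branch (or the colour branch) and passing a single feature vector to the LSTM gives the shape of the first model, which uses one stream at a time.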
