Deep Learning Based Hand Gesture Recognition and UAV Flight Controls

Dynamic hand gesture recognition is a desirable alternative means of human-computer interaction. This paper presents a hand gesture recognition system designed to control the flight of unmanned aerial vehicles (UAVs). A data representation model is introduced that represents a dynamic gesture sequence by converting the 4-D spatiotemporal data to a 2-D matrix and a 1-D array. To train the system to recognize the designed gestures, skeleton data collected from a Leap Motion Controller are converted to two different data models. A training dataset of 9 124 samples and a testing dataset of 1 938 samples are created to train and test the three proposed deep learning neural networks: a 2-layer fully connected neural network, a 5-layer fully connected neural network and an 8-layer convolutional neural network. The static testing results show that the 2-layer fully connected neural network achieves an average accuracy of 96.7% on scaled datasets and 12.3% on non-scaled datasets. The 5-layer fully connected neural network achieves an average accuracy of 98.0% on scaled datasets and 89.1% on non-scaled datasets. The 8-layer convolutional neural network achieves an average accuracy of 89.6% on scaled datasets and 96.9% on non-scaled datasets. Testing on a drone-kit simulator and on a real drone shows that the system is feasible for drone flight control.
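
The data representation described above can be illustrated with a minimal sketch. The abstract does not specify the exact layout, so the following assumes a gesture is a sequence of frames, each holding (x, y, z) positions for a fixed set of skeleton joints, and that "scaled" means min-max scaling to [0, 1]. The function names `to_matrix`, `to_vector` and `min_max_scale` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# Assumed input layout (not specified in the abstract):
# sequence[t][j] = (x, y, z) position of joint j in frame t.
Frame = Sequence[Sequence[float]]


def to_matrix(sequence: Sequence[Frame]) -> List[List[float]]:
    """Turn a (frames, joints, 3) gesture into a 2-D matrix, one row per frame."""
    matrix = []
    for frame in sequence:
        row: List[float] = []
        for joint in frame:
            if len(joint) != 3:
                raise ValueError("each joint needs exactly (x, y, z)")
            row.extend(float(c) for c in joint)
        matrix.append(row)
    if len({len(r) for r in matrix}) > 1:
        raise ValueError("all frames must have the same number of joints")
    return matrix


def to_vector(sequence: Sequence[Frame]) -> List[float]:
    """Flatten the same gesture into a 1-D array (rows concatenated in time order)."""
    return [value for row in to_matrix(sequence) for value in row]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant input maps to all zeros (an assumed convention)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy gesture: 2 frames, 2 joints per frame.
gesture = [
    [(0, 10, 20), (30, 40, 50)],
    [(60, 70, 80), (90, 100, 110)],
]
matrix = to_matrix(gesture)            # 2 rows x 6 columns
vector = to_vector(gesture)            # 12 values
scaled_vector = min_max_scale(vector)  # same 12 values mapped to [0, 1]
```

Under these assumptions, the 2-D matrix form would suit a convolutional network and the 1-D form a fully connected network, while the scaled and non-scaled variants correspond to applying or skipping `min_max_scale`.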
