Realtime Human-UAV Interaction Using Deep Learning

In this paper, we propose a realtime human gesture identification approach for controlling a micro UAV in a GPS-denied environment. Exploiting breakthroughs of deep convolutional networks in computer vision, we develop a robust Human-UAV Interaction (HUI) system that can detect and identify a person's gestures to control a micro UAV in real time. We also build a new dataset with 23 participants to train or fine-tune deep neural networks for human gesture detection. Based on the collected dataset, the state-of-the-art YOLOv2 detection network is tailored to detect the locations of a person's face and two hands. An interpreter approach is then proposed to infer the gesture from the detection results, where each interpreted gesture corresponds to a UAV flight command. Real flight experiments performed by non-expert users with the Bebop 2 micro UAV have validated the proposed HUI system. The gesture detection model, together with a demo, will be made publicly available to support future research.
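The abstract outlines a two-stage pipeline: a detector localizes the face and both hands, and an interpreter maps their relative positions to a flight command. The paper does not spell out the mapping rules here, so the gesture set, thresholds, and function names below are purely illustrative assumptions, sketching how such an interpreter could work:

```python
# Hypothetical sketch of the detection-to-command interpreter described in the
# abstract. The gesture rules and command names are illustrative assumptions,
# not the paper's actual mapping. Each detection is a (x_center, y_center)
# tuple in image coordinates, where y grows downward.

def interpret_gesture(face, left_hand, right_hand):
    """Infer a flight command from face and hand box centers."""
    if face is None or left_hand is None or right_hand is None:
        return "hover"  # incomplete detection: hold position for safety
    _, fy = face
    _, ly = left_hand
    _, ry = right_hand
    if ly < fy and ry < fy:      # both hands raised above the face
        return "take_off"
    if ly > fy and ry > fy:      # both hands lowered below the face
        return "land"
    if ly < fy:                  # only the left hand raised
        return "move_left"
    if ry < fy:                  # only the right hand raised
        return "move_right"
    return "hover"
```

In a real system each interpreted command would be forwarded to the UAV's control interface (e.g., as a velocity or piloting message), with temporal smoothing over several frames to suppress spurious detections.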

[1] Ali Farhadi, et al. YOLO9000: Better, Faster, Stronger, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2] Federico Manuri, et al. A Kinect-based natural interface for quadrotor control, 2011, Entertainment Computing.

[3] Ling Shao, et al. Action Recognition Using 3D Histograms of Texture and a Multi-Class Boosting Classifier, 2017, IEEE Transactions on Image Processing.

[4] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5] Hong Bo Zhou, et al. Automatic Method for Determining Cluster Number Based on Silhouette Coefficient, 2014.

[6] Morgan Quigley, et al. ROS: an open-source Robot Operating System, 2009, ICRA.

[7] Hassan Noura, et al. A gesture based Kinect for quadrotor control, 2015 International Conference on Information and Communication Technology Research (ICTRC).

[8] Baochang Zhang, et al. Adaptive Local Movement Modeling for Robust Object Tracking, 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[9] Luca Maria Gambardella, et al. Human Control of UAVs using Face Pose Estimates and Hand Gestures, 2014 9th ACM/IEEE International Conference on Human-Robot Interaction (HRI).

[10] Martin Molina, et al. Natural user interfaces for human-drone multi-modal interaction, 2016 International Conference on Unmanned Aircraft Systems (ICUAS).

[11] Rongrong Ji, et al. Bounding Multiple Gaussians Uncertainty with Application to Object Tracking, 2016, International Journal of Computer Vision.