Learning Pose Estimation for UAV Autonomous Navigation and Landing Using Visual-Inertial Sensor Data

In this work, we propose a robust network-in-the-loop control system for autonomous navigation and landing of an Unmanned Aerial Vehicle (UAV). To estimate the UAV's absolute pose, we develop a deep neural network (DNN) architecture for visual-inertial odometry that provides a robust alternative to traditional methods. We first evaluate the accuracy of the estimator by comparing our model's predictions against traditional visual-inertial approaches on the publicly available EuRoC MAV dataset. The results indicate an improvement in pose-estimation accuracy of up to 25% over the baseline. Finally, we integrate the data-driven estimator into the closed-loop flight control system of AirSim, a simulator available as a plugin for Unreal Engine, and provide simulation results for autonomous navigation and landing.
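The abstract does not spell out the network internals, but a minimal PyTorch sketch of the kind of visual-inertial pose regressor it describes could look as follows. The layer sizes, the CNN-plus-LSTM fusion scheme, and the 6-DoF output parameterization are illustrative assumptions, not the authors' exact architecture.

    # Illustrative sketch of a visual-inertial pose-regression network
    # (layer sizes, fusion scheme, and output parameterization are assumptions,
    #  not the exact architecture used in the paper).
    import torch
    import torch.nn as nn

    class VisualInertialPoseNet(nn.Module):
        def __init__(self, imu_dim=6, hidden=256):
            super().__init__()
            # CNN encoder for a stacked pair of consecutive grayscale frames.
            self.visual = nn.Sequential(
                nn.Conv2d(2, 16, kernel_size=7, stride=2, padding=3), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            # Recurrent encoder for the raw IMU samples between the two frames.
            self.inertial = nn.LSTM(imu_dim, 64, batch_first=True)
            # Fused features are regressed to a 6-DoF pose
            # (3D translation + 3 rotation parameters).
            self.head = nn.Sequential(
                nn.Linear(64 + 64, hidden), nn.ReLU(),
                nn.Linear(hidden, 6),
            )

        def forward(self, frames, imu):
            # frames: (B, 2, H, W) image pair; imu: (B, T, 6) accel + gyro samples
            v = self.visual(frames)
            _, (h, _) = self.inertial(imu)
            fused = torch.cat([v, h[-1]], dim=1)
            return self.head(fused)

    # Smoke test with random tensors shaped like EuRoC-style inputs.
    model = VisualInertialPoseNet()
    pose = model(torch.randn(4, 2, 240, 320), torch.randn(4, 50, 6))
    print(pose.shape)  # torch.Size([4, 6])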

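Likewise, the closed-loop ("network-in-the-loop") integration with AirSim could in principle be wired up through AirSim's standard Python client. In the sketch below, estimate_pose is a placeholder for the learned estimator, and the proportional gain, loop rate, and stopping threshold are arbitrary illustrative values, not the paper's controller.

    # Minimal sketch of a network-in-the-loop navigation-and-landing loop
    # using the AirSim Python API. `estimate_pose` stands in for the learned
    # visual-inertial estimator; gains and thresholds are placeholders.
    import numpy as np
    import airsim

    def estimate_pose(image_response, imu_data):
        """Placeholder for the learned visual-inertial pose estimator."""
        raise NotImplementedError

    def fly_to_and_land(target_xyz, rate_hz=10.0, gain=0.5):
        client = airsim.MultirotorClient()
        client.confirmConnection()
        client.enableApiControl(True)
        client.armDisarm(True)
        client.takeoffAsync().join()

        while True:
            # Grab the current camera frame and IMU sample from the simulator.
            image = client.simGetImages(
                [airsim.ImageRequest("0", airsim.ImageType.Scene, False, False)])[0]
            imu = client.getImuData()

            # The learned estimate replaces ground-truth state in the loop.
            position = np.asarray(estimate_pose(image, imu)[:3])
            error = np.asarray(target_xyz) - position
            if np.linalg.norm(error) < 0.2:  # close enough: hand over to landing
                break

            # Simple proportional velocity command toward the target.
            vx, vy, vz = gain * error
            client.moveByVelocityAsync(float(vx), float(vy), float(vz),
                                       duration=1.0 / rate_hz).join()

        client.landAsync().join()
        client.armDisarm(False)
        client.enableApiControl(False)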