Autonomous UAV Navigation: A DDPG-Based Deep Reinforcement Learning Approach

In this paper, we propose an autonomous UAV path planning framework based on deep reinforcement learning. The objective is to employ a self-trained UAV as a flying mobile unit that reaches spatially distributed moving or static targets in a given three-dimensional urban area. To this end, a Deep Deterministic Policy Gradient (DDPG) agent with a continuous action space is designed to train the UAV to navigate through or over obstacles and reach its assigned target. A customized reward function is developed to minimize the distance between the UAV and its destination while penalizing collisions. Numerical simulations investigate how the UAV learns the environment and autonomously determines trajectories in several selected scenarios.
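The reward described above, which favors a shorter distance to the target and penalizes collisions, can be illustrated with a minimal Python sketch. The abstract does not give the exact reward used in the paper, so the function name `shaped_reward`, its parameters, and the constants `collision_penalty`, `reach_radius`, and `reach_bonus` are hypothetical choices made only for illustration.

```python
import math


def shaped_reward(uav_pos, target_pos, collided,
                  collision_penalty=10.0, reach_radius=1.0, reach_bonus=10.0):
    """Illustrative reward: closer to the target is better, collisions cost extra.

    All constants are assumed values, not taken from the paper.
    """
    # Euclidean distance between the UAV and its target in 3D space.
    distance = math.dist(uav_pos, target_pos)

    # Base term: the negative distance, so reducing the distance raises the reward.
    reward = -distance

    # Fixed penalty when the UAV hits an obstacle.
    if collided:
        reward -= collision_penalty

    # Bonus when the UAV is within the (assumed) arrival radius without colliding.
    if not collided and distance <= reach_radius:
        reward += reach_bonus

    return reward


# Example: a UAV 5 units from its target, with and without a collision.
free_step = shaped_reward((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), collided=False)
crash_step = shaped_reward((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), collided=True)
```

In a DDPG training loop, a value like this would be returned by the environment after each continuous action (for example, a change in heading and speed) and stored in the replay buffer together with the state transition.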
