UAV Coverage Path Planning under Varying Power Constraints using Deep Reinforcement Learning

Coverage path planning (CPP) is the task of designing a trajectory that enables a mobile agent to travel over every point of an area of interest. We propose a new method to control an unmanned aerial vehicle (UAV) carrying a camera on a CPP mission with random start positions and multiple landing-position options in an environment containing no-fly zones. While numerous approaches have been proposed for similar CPP problems, we leverage end-to-end reinforcement learning (RL) to learn a control policy that generalizes over varying power constraints for the UAV. Despite recent improvements in battery technology, the maximum flying range of small UAVs remains a severe constraint, exacerbated by variations in the UAV's power consumption that are hard to predict. By using map-like input channels to feed spatial information through convolutional network layers to the agent, we train a double deep Q-network (DDQN) to make control decisions for the UAV, balancing the limited power budget against the coverage goal. The proposed method can be applied to a wide variety of environments and harmonizes complex goal structures with system constraints.
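The abstract names two concrete mechanisms: map-like input channels processed by convolutional layers, and a double deep Q-network (DDQN) whose target decouples action selection from action evaluation. The sketch below illustrates both under stated assumptions; it is not the authors' implementation. The class name MapDQN, the number and meaning of the input channels (e.g., coverage map, no-fly zones, UAV position), the layer sizes, the action count, and the discount factor are all illustrative choices, not values from the paper.

```python
# Minimal sketch (PyTorch) of a convolutional Q-network over map-like input
# channels plus a scalar power budget, together with the DDQN target.
# All architecture details and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn


class MapDQN(nn.Module):
    """Q-network over stacked map channels (hypothetically: coverage map,
    no-fly zones, UAV position, landing zones) and a (B, 1) power scalar."""

    def __init__(self, n_channels: int = 4, n_actions: int = 6, map_size: int = 32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(n_channels, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        with torch.no_grad():  # infer flattened feature size from a dummy map
            flat = self.conv(torch.zeros(1, n_channels, map_size, map_size)).shape[1]
        self.head = nn.Sequential(
            nn.Linear(flat + 1, 128), nn.ReLU(),  # +1 for the power budget scalar
            nn.Linear(128, n_actions),
        )

    def forward(self, maps: torch.Tensor, power: torch.Tensor) -> torch.Tensor:
        # Concatenate spatial features with the remaining-power input.
        return self.head(torch.cat([self.conv(maps), power], dim=1))


def ddqn_target(online: MapDQN, target: MapDQN, reward: torch.Tensor,
                next_maps: torch.Tensor, next_power: torch.Tensor,
                done: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    """DDQN target: the online net selects the next action, the target net
    evaluates it. `done` is a float tensor (1.0 at episode end)."""
    with torch.no_grad():
        a_star = online(next_maps, next_power).argmax(dim=1, keepdim=True)
        q_next = target(next_maps, next_power).gather(1, a_star).squeeze(1)
        return reward + gamma * (1.0 - done) * q_next
```

The decoupling in ddqn_target follows the double Q-learning idea of van Hasselt et al.: using one network to pick the action and another to score it reduces the Q-value overestimation of plain DQN, which is plausibly relevant here, where the agent must trade coverage gains against a hard power budget.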
