Paper Information - A Deep Reinforcement Learning Approach for the Patrolling Problem of Water Resources Through Autonomous Surface Vehicles: The Ypacarai Lake Case

Autonomous Surface Vehicles (ASVs) are highly useful for the continuous monitoring and exploration of water resources due to their autonomy, mobility, and relatively low cost. In the path planning context, the patrolling problem is usually addressed with heuristic approaches, such as Genetic Algorithms (GA) or Reinforcement Learning (RL), because of the complexity and high dimensionality of the problem. In this paper, the patrolling problem of Ypacarai Lake (Asunción, Paraguay) has been formulated as a Markov Decision Process (MDP) for two possible cases: the homogeneous and the non-homogeneous scenarios. A tailored reward function has been designed for the non-homogeneous case. Two Deep Reinforcement Learning algorithms, Deep Q-Learning (DQL) and Double Deep Q-Learning (DDQL), have been evaluated to solve the patrolling problem. Furthermore, due to the high number of parameters and hyperparameters involved in the algorithms, a thorough search has been conducted to find the best values for training the neural networks and for the proposed reward function. According to the results, a suitable configuration of the parameters yields better coverage, reaching more than 93% of the lake surface on average. In addition, the proposed approach achieves higher sample redundancy in important zones than other commonly used algorithms for non-homogeneous coverage path planning, such as Policy Gradient, the lawnmower algorithm, or random exploration, achieving a 64% improvement in the mean time between visits.
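
The abstract names DQL and DDQL without stating how they differ. As a minimal sketch, assuming the standard formulations from the literature (see "Deep Reinforcement Learning with Double Q-Learning" in the reference list) rather than the paper's own implementation, the two algorithms differ only in how the bootstrap target is computed. The function names, the Q-value lists, and the discount factor below are hypothetical and chosen for illustration only.

```python
def dqn_target(reward, next_q_target, gamma, done=False):
    """DQL target: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma, done=False):
    """DDQL target: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    # Index of the greedy action according to the online network.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for four actions in the next state (not from the paper).
q_online = [1.0, 2.5, 0.5, 2.0]
q_target = [1.5, 2.0, 0.25, 3.0]

print(dqn_target(1.0, q_target, gamma=0.5))                   # 2.5
print(double_dqn_target(1.0, q_online, q_target, gamma=0.5))  # 2.0
```

In this toy case the DQL target uses the largest target-network value (3.0), while the DDQL target uses the target-network value of the action preferred by the online network (2.0). Decoupling selection from evaluation in this way is what is generally credited with reducing overestimation of action values.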

[1] Gian Luca Foresti, et al. Drone patrolling with reinforcement learning, 2019, ICDSC.

[2] Yoshua Bengio, et al. Revisiting Fundamentals of Experience Replay, 2020, ICML.

[3] Taua M. Cabreira, et al. Strategies for Patrolling Missions with Multiple UAVs, 2019, Journal of Intelligent & Robotic Systems.

[4] Mario Eduardo, et al. Reactive evolutionary path planning for autonomous surface vehicles in lake environments, 2019.

[5] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.

[6] Qixin Sha, et al. Deep Interactive Reinforcement Learning for Path Following of Autonomous Underwater Vehicle, 2020, IEEE Access.

[7] Davide Spinello, et al. Multi-Agent Area Coverage Control Using Reinforcement Learning, 2016, FLAIRS.

[8] Analysis of Contaminant Transport under Wind Conditions on the Surface of a Shallow Lake, 2016.

[9] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.

[10] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[11] Nik Bessis, et al. Intelligent Online Learning Strategy for an Autonomous Surface Vehicle in Lake Environments Using Evolutionary Computation, 2019, IEEE Intelligent Transportation Systems Magazine.

[12] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.

[13] Daniel Gutiérrez-Reina, et al. An evolutionary approach to constrained path planning of an autonomous surface vehicle for maximizing the covered area of Ypacarai Lake, 2019, Soft Comput.

[14] Ian R. Fasel, et al. Optimization on a Budget: A Reinforcement Learning Approach, 2008, NIPS.

[15] Yue Zhang, et al. Reduce UAV Coverage Energy Consumption through Actor-Critic Algorithm, 2019, 2019 15th International Conference on Mobile Ad-Hoc and Sensor Networks (MSN).

[16] Ibrahim A. Hameed, et al. Coverage path planning software for autonomous robotic lawn mower using Dubins' curve, 2017, 2017 IEEE International Conference on Real-time Computing and Robotics (RCAR).

[17] Madalina M. Drugan, et al. Reinforcement learning versus evolutionary computation: A survey on hybrid algorithms, 2019, Swarm Evol. Comput.

[18] Andrew Cahill. Catastrophic Forgetting in Reinforcement-Learning Environments, 2010.

[19] Anna M. Michalak, et al. Challenges in tracking harmful algal blooms: A synthesis of evidence from Lake Erie, 2015.

[20] Peter Stone, et al. Deep R-Learning for Continual Area Sweeping, 2020, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[21] MengChu Zhou, et al. A survey of multi-robot regular and adversarial patrolling, 2019, IEEE/CAA Journal of Automatica Sinica.

[22] David Gesbert, et al. UAV Coverage Path Planning under Varying Power Constraints using Deep Reinforcement Learning, 2020, ArXiv.

[23] Jian Xiao, et al. A Distributed Multi-Agent Dynamic Area Coverage Algorithm Based on Reinforcement Learning, 2020, IEEE Access.

[24] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.

[25] Yann Chevaleyre, et al. Theoretical analysis of the multi-agent patrolling problem, 2004, Proceedings. IEEE/WIC/ACM International Conference on Intelligent Agent Technology, 2004. (IAT 2004).

[26] Shalabh Bhatnagar, et al. Natural actor-critic algorithms, 2009, Autom.

[27] Horst Bischof, et al. Semi-supervised image classification with huberized Laplacian Support Vector Machines, 2013, 2013 IEEE 9th International Conference on Emerging Technologies (ICET).

[28] Derui Ding, et al. Path Planning via an Improved DQN-Based Learning Policy, 2019, IEEE Access.

[29] Nando de Freitas, et al. Sample Efficient Actor-Critic with Experience Replay, 2016, ICLR.

[30] Uwe D. Hanebeck, et al. Density trees for efficient nonlinear state estimation, 2010, 2010 13th International Conference on Information Fusion.

[31] Rajesh Elara Mohan, et al. Complete coverage path planning using reinforcement learning for Tetromino based cleaning and maintenance robot, 2020.

[32] Daniel Gutiérrez-Reina, et al. Comparison of Eulerian and Hamiltonian circuits for evolutionary-based path planning of an autonomous surface vehicle for monitoring Ypacarai Lake, 2019, J. Ambient Intell. Humaniz. Comput.

[33] Daniel Gutiérrez-Reina, et al. A Comparison of Local Path Planning Techniques of Autonomous Surface Vehicles for Monitoring Applications: The Ypacarai Lake Case-study, 2020, Sensors.

[34] Zendai Kashino, et al. Deep Reinforcement Learning Robot for Search and Rescue Applications: Exploration in Unknown Cluttered Environments, 2019, IEEE Robotics and Automation Letters.