A Dimensional Comparison between Evolutionary Algorithm and Deep Reinforcement Learning Methodologies for Autonomous Surface Vehicles with Water Quality Sensors

Monitoring water resources with Autonomous Surface Vehicles equipped with water-quality sensors has recently become feasible thanks to advances in unmanned transportation technology. Ypacaraí Lake, the largest water resource in Paraguay, suffers from severe contamination caused by cyanobacteria blooms. To monitor the blooms with these on-board sensor modules, a Non-Homogeneous Patrolling Problem (an NP-hard problem) must be solved in a feasible amount of time. This work presents a dimensionality study that compares the most common methodologies, Evolutionary Algorithms and Deep Reinforcement Learning, across different map scales and fleet sizes and under changing environmental conditions. The results show that Deep Q-Learning outperforms the evolutionary method in sample efficiency by 50–70% at higher resolutions. It also copes better than the Evolutionary Algorithm with large state-action spaces. In contrast, the evolutionary approach is more efficient at lower resolutions and needs fewer parameters to synthesize robust solutions. Overall, the study indicates that Deep Q-Learning approaches are more efficient for the Non-Homogeneous Patrolling Problem, but at the cost of many hyper-parameters that affect stability and convergence.
