COLREGs-compliant multiship collision avoidance based on deep reinforcement learning

Abstract: Developing a high-level autonomous collision avoidance system for ships that can operate in an unstructured and unpredictable environment is challenging. In congested sea areas in particular, each ship must continuously make decisions to avoid collisions with other ships in a busy and complex waterway. Furthermore, recent reports indicate that a large number of marine collision accidents are caused by, or related to, human decision failures stemming from a lack of situational awareness and from failure to comply with the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs). In this study, we propose an efficient method for multiship collision avoidance based on deep reinforcement learning (DRL), extending our previous study (Zhao et al., 2019). The proposed method uses a deep neural network (DNN) to map the states of encountered ships directly to the ownship's steering commands, expressed as rudder angles. The DNN is trained over multiple ships in a rich set of encounter situations using a policy-gradient-based DRL algorithm. To handle multiple encountered ships, we classify them into four regions based on COLREGs and consider only the nearest ship in each region. We validate the proposed method in a variety of simulated scenarios with thorough performance evaluations, and show that the final DRL controller obtains time-efficient, collision-free paths for multiple ships. Simulation results indicate that multiple ships can avoid collisions with each other while simultaneously following their own predefined paths. In addition, the proposed approach adapts well to unknown, complex environments with various encountered ships.
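The region-based reduction of encountered ships described in the abstract can be illustrated with a minimal sketch. The abstract does not give the sector boundaries, so the angles below (a narrow head-on sector, starboard and port crossing sectors, and an astern sector split at 112.5 degrees) are assumptions, as are the function names `relative_bearing`, `classify_region`, and `nearest_per_region`. The sketch keeps only the nearest target ship in each of the four regions, which gives a fixed-size input regardless of how many ships are present.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (east, north) position in metres

# Hypothetical region names; the paper's own labels may differ.
REGION_NAMES = ("head_on", "starboard", "astern", "port")


def relative_bearing(own_xy: Point, own_heading_deg: float, target_xy: Point) -> float:
    """Bearing of the target, clockwise from the ownship's bow, in [0, 360)."""
    d_east = target_xy[0] - own_xy[0]
    d_north = target_xy[1] - own_xy[1]
    # atan2(east, north) gives a compass-style bearing clockwise from north.
    absolute = math.degrees(math.atan2(d_east, d_north))
    return (absolute - own_heading_deg) % 360.0


def classify_region(rel_bearing_deg: float) -> str:
    """Map a relative bearing to one of four sectors (assumed boundaries)."""
    b = rel_bearing_deg % 360.0
    if b >= 355.0 or b < 5.0:
        return "head_on"
    if b < 112.5:
        return "starboard"
    if b < 247.5:
        return "astern"
    return "port"


def nearest_per_region(
    own_xy: Point, own_heading_deg: float, targets: List[Point]
) -> Dict[str, Optional[Tuple[float, Point]]]:
    """Return (distance, position) of the nearest target in each region, or None."""
    nearest: Dict[str, Optional[Tuple[float, Point]]] = {n: None for n in REGION_NAMES}
    for target in targets:
        dist = math.hypot(target[0] - own_xy[0], target[1] - own_xy[1])
        region = classify_region(relative_bearing(own_xy, own_heading_deg, target))
        best = nearest[region]
        if best is None or dist < best[0]:
            nearest[region] = (dist, target)
    return nearest


if __name__ == "__main__":
    # Ownship at the origin heading due north, with five nearby ships.
    ships = [(0.0, 10.0), (0.0, 5.0), (5.0, 5.0), (-5.0, 0.0), (0.0, -8.0)]
    for region, hit in nearest_per_region((0.0, 0.0), 0.0, ships).items():
        print(region, hit)
```

In this usage example two ships lie dead ahead, and only the closer one is retained for the head-on region. A region with no ship stays `None`, so a controller built on this would need some placeholder state for empty regions; how the paper handles that case is not stated in the abstract.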
