Collision avoidance for an unmanned surface vehicle using deep reinforcement learning

Abstract In this paper, a deep reinforcement learning (DRL)-based collision avoidance method is proposed for an unmanned surface vehicle (USV). The approach addresses the decision-making stage of collision avoidance: it determines whether an avoidance maneuver is necessary and, if so, in which direction the maneuver should be executed. To exploit the visual recognition capability of deep neural networks for analyzing the complex and ambiguous encounter situations a USV typically faces, a grid-map representation of the ship encounter situation is proposed. For the DRL agent, a neural network architecture and a semi-Markov decision process model are designed specifically for the USV collision avoidance problem. The DRL network is trained through repeated collision avoidance simulations, and the trained network is then evaluated in collision avoidance experiments and simulations to assess its situation recognition and collision avoidance capability.
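
To make the grid-map state representation and the value network concrete, the following Python/PyTorch snippet gives a minimal sketch. It is not the authors' implementation: the grid size, the channel layout (occupancy plus relative-velocity channels), the network dimensions, and the three-action set (keep course, avoid to port, avoid to starboard) are all assumptions made purely for illustration.

```python
# Minimal sketch (illustrative assumptions, not the paper's implementation):
# a grid-map encoding of a ship encounter and a small convolutional Q-network
# that scores discrete collision avoidance actions.
import numpy as np
import torch
import torch.nn as nn

GRID = 64          # assumed grid resolution (cells per side)
CELL = 20.0        # assumed cell size in meters
ACTIONS = ["keep_course", "avoid_port", "avoid_starboard"]  # assumed action set

def encounter_to_grid(targets):
    """Rasterize target ships into an own-ship-centered grid map.

    targets: list of (x, y, vx, vy) in meters and m/s, expressed in the
    own-ship body frame (+y ahead, +x to starboard).
    Returns a (3, GRID, GRID) array: occupancy, relative vx, relative vy.
    """
    grid = np.zeros((3, GRID, GRID), dtype=np.float32)
    half = GRID // 2
    for x, y, vx, vy in targets:
        i = int(half - y / CELL)   # rows increase toward astern
        j = int(half + x / CELL)   # columns increase toward starboard
        if 0 <= i < GRID and 0 <= j < GRID:
            grid[0, i, j] = 1.0
            grid[1, i, j] = vx
            grid[2, i, j] = vy
    return grid

class QNetwork(nn.Module):
    """Small CNN mapping a grid-map state to Q-values for each avoidance action."""
    def __init__(self, n_actions=len(ACTIONS)):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 14 * 14, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, x):
        return self.head(self.conv(x))

# Example: one target ship 300 m ahead, closing at 4 m/s.
state = encounter_to_grid(targets=[(0.0, 300.0, 0.0, -4.0)])
q_values = QNetwork()(torch.from_numpy(state).unsqueeze(0))
print(ACTIONS[int(q_values.argmax(dim=1))])
```

In a setup of this kind, the trained Q-values would drive the decision-making stage described in the abstract: an avoidance maneuver would be initiated only when one of the avoidance actions scores higher than keeping course, and the higher-scoring side would give the maneuver direction.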
