Deep reinforcement learning-based controller for path following of an unmanned surface vehicle

Abstract This paper proposes a deep reinforcement learning (DRL)-based controller for path following of an unmanned surface vehicle (USV). The controller develops the vehicle's path-following capability on its own by interacting with its environment. A deep deterministic policy gradient (DDPG) algorithm, an actor-critic reinforcement learning algorithm, was adapted to learn from the USV's experience during path-following trials. A Markov decision process model, comprising state, action, and reward formulations designed specifically for the USV path-following problem, is suggested. The control policy was trained through repeated trials of path-following simulation. The path-following and self-learning capabilities of the proposed method were validated through USV simulation and a free-running test of a full-scale USV.
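The abstract does not spell out the state, action, and reward definitions, so the following is only a minimal sketch of what a path-following Markov decision process could look like for a straight path segment. The choice of cross-track error and heading error as the state, the normalized steering action, the linear penalty reward, and all names and constants (`observe`, `scale_action`, `reward`, `e_scale`, `max_cmd`) are assumptions made for illustration, not the formulation used in the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class PathFollowingState:
    cross_track_error: float  # signed distance from the path, in metres (assumed state variable)
    heading_error: float      # vehicle heading minus path direction, in radians, wrapped to [-pi, pi)


def wrap_angle(angle: float) -> float:
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def observe(x: float, y: float, psi: float, wp0, wp1) -> PathFollowingState:
    """Build the (hypothetical) state from the vehicle pose and a straight path wp0 -> wp1."""
    path_angle = math.atan2(wp1[1] - wp0[1], wp1[0] - wp0[0])
    dx, dy = x - wp0[0], y - wp0[1]
    # Component of the position offset perpendicular to the path direction.
    cross_track = -dx * math.sin(path_angle) + dy * math.cos(path_angle)
    return PathFollowingState(cross_track, wrap_angle(psi - path_angle))


def scale_action(a: float, max_cmd: float) -> float:
    """Map a normalized actor output in [-1, 1] to a steering command (assumed action)."""
    return max(-1.0, min(1.0, a)) * max_cmd


def reward(state: PathFollowingState, e_scale: float = 5.0) -> float:
    """Illustrative reward: zero on the path with aligned heading, negative otherwise."""
    return -abs(state.cross_track_error) / e_scale - abs(state.heading_error) / math.pi
```

In a DDPG setup, the actor network would map such a state to the normalized action, and the critic would estimate the expected return of each state-action pair using a reward of this kind. A vehicle 2 m to the side of the path with an aligned heading would receive a reward of about -0.4 under these assumed constants.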
