Curved Path Following with Deep Reinforcement Learning: Results from Three Vessel Models

This paper proposes a deep reinforcement learning methodology for solving the curved-path following problem for underactuated vehicles subject to unknown ocean currents. Three high-complexity dynamic models are used to simulate the motions of a mariner vessel, a container vessel and a tanker. The policy search algorithm is tasked with finding suitable steering policies without any prior information about the vessels or their environment. First, we train the algorithm to find a policy for the straight-line following problem for each of the simulated vessels, and then apply transfer learning to extend the policies to the curved-path case. This turns out to be much faster than training directly for curved paths, while achieving indistinguishable performance.
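As a rough illustration of the straight-line following problem mentioned above, the following sketch computes the signed cross-track error of a vessel relative to a straight path, which is the quantity a steering policy would typically try to drive to zero. The function names, the planar coordinate convention and the use of the cross-track error as the tracked quantity are assumptions made here for illustration; the abstract does not specify the state or reward used in the paper.

```python
import math


def cross_track_error(x, y, x_k, y_k, alpha):
    """Signed lateral distance from the vessel position (x, y) to a straight
    path through the waypoint (x_k, y_k) with path angle alpha (radians).

    Hypothetical helper: standard path-fixed frame rotation, not taken
    from the paper.
    """
    return -(x - x_k) * math.sin(alpha) + (y - y_k) * math.cos(alpha)


def along_track_distance(x, y, x_k, y_k, alpha):
    """Distance travelled along the path direction from the waypoint."""
    return (x - x_k) * math.cos(alpha) + (y - y_k) * math.sin(alpha)


if __name__ == "__main__":
    # Path along the x-axis: a vessel at (5, 2) is 2 units off the path.
    print(cross_track_error(5.0, 2.0, 0.0, 0.0, 0.0))
    print(along_track_distance(5.0, 2.0, 0.0, 0.0, 0.0))
```

For a curved path, the same computation could be applied in a frame tangent to the path at the closest point, which is one way to see why a policy trained on straight lines may be a useful starting point for the curved-path case.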
