Docking Control of an Autonomous Underwater Vehicle Using Reinforcement Learning

To enable persistent autonomous systems in the future, autonomous underwater vehicles (AUVs) will need to dock autonomously onto a charging station. Here, reinforcement learning strategies were applied for the first time to control the docking of an AUV onto a fixed platform in a simulation environment. Two reinforcement learning schemes were investigated: deep deterministic policy gradient (DDPG), which uses continuous state and action spaces, and deep Q-network (DQN), which uses a continuous state space but a discrete action space. For DQN, the discrete actions were defined as step changes in the control input signals. The performance of the reinforcement learning strategies was compared with that of classical and optimal control techniques. The control actions selected by DDPG suffer from chattering due to the hyperbolic tangent output layer of the actor network. Conversely, DQN offers the best compromise between short docking time and low control effort, whilst meeting the docking requirements. Although the reinforcement learning algorithms incur a very high computational cost at training time, they are five orders of magnitude faster than optimal control at deployment time, thus enabling an online implementation. Therefore, reinforcement learning achieves performance similar to that of optimal control at a much lower computational cost at deployment, whilst also providing a more general framework.
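To make the discrete-action scheme concrete, the sketch below shows how DQN action indices could be mapped to step changes in the control input signals. This is a minimal Python illustration, not the implementation used in the study: the choice of control inputs (propeller speed and stern-plane angle), the step sizes, and the saturation limits are all assumed purely for the example.

import itertools
import numpy as np

# Assumed control inputs: propeller speed n [rpm] and stern-plane angle delta_s [rad].
# Each DQN action applies a fixed step change (decrease, hold, increase) to every input,
# so the discrete action space is the Cartesian product of the per-input steps.

STEPS = {
    "n": (-50.0, 0.0, 50.0),        # rpm step per control interval (assumed)
    "delta_s": (-0.05, 0.0, 0.05),  # rad step per control interval (assumed)
}
LIMITS = {
    "n": (0.0, 1500.0),             # actuator saturation limits (assumed)
    "delta_s": (-0.26, 0.26),
}
ACTIONS = list(itertools.product(*STEPS.values()))  # 9 discrete actions


def apply_action(u: dict, action_index: int) -> dict:
    """Apply the step changes encoded by a DQN action index to the current control inputs."""
    steps = dict(zip(STEPS.keys(), ACTIONS[action_index]))
    return {
        name: float(np.clip(u[name] + steps[name], *LIMITS[name]))
        for name in u
    }


if __name__ == "__main__":
    u = {"n": 1000.0, "delta_s": 0.0}
    u = apply_action(u, ACTIONS.index((50.0, -0.05)))  # increase rpm, dive slightly
    print(u)  # {'n': 1050.0, 'delta_s': -0.05}

Under this scheme the Q-network only has to rank a small set of action indices, which avoids the chattering associated with the tanh-bounded DDPG actor, at the cost of limiting how finely the control inputs can be adjusted at each time step.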
