AUV path following controlled by modified Deep Deterministic Policy Gradient

Abstract This study proposes a Deep Deterministic Policy Gradient algorithm based on optimized sample pools and an average motion critic network (OSAM-DDPG) for path-following control of autonomous underwater vehicles (AUVs). Two ideas, optimizing the sampling mode and evaluating motion, are introduced to improve the efficiency of the algorithm. OSAM-DDPG is trained on the force-to-state mapping of the AUV's dynamical model to realize its control. In simulation tests, the OSAM-DDPG algorithm needs only a limited number of episodes to obtain a complete control strategy. Using the experience gained from training, the controller can follow a variety of paths in an environment with interference, and the results show that path-following control based on OSAM-DDPG outperforms S-plane control.
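For orientation, the two ingredients of standard DDPG that the abstract's modifications build on, the sample pool (replay buffer) and the slowly updated target network, can be sketched as follows. This is a minimal illustration of plain DDPG only. It is not the optimized sample pool or the average motion critic of OSAM-DDPG, whose details are not given in the abstract. The names `ReplayBuffer` and `soft_update` and the parameter `tau` are hypothetical choices for this sketch.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size pool of transitions (hypothetical helper, plain DDPG)."""

    def __init__(self, capacity, seed=0):
        # Oldest transitions are dropped automatically once capacity is reached.
        self.pool = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, transition):
        # A transition is assumed to be (state, action, reward, next_state).
        self.pool.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement, as in standard DDPG.
        return self.rng.sample(list(self.pool), batch_size)


def soft_update(target, online, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


# Usage: fill the pool, draw a mini-batch, and nudge target weights.
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add((step, 0.0, 0.0, step + 1))
batch = buffer.sample(2)
new_target = soft_update([0.0, 0.0], [1.0, 2.0], tau=0.1)
```

Uniform sampling is the baseline that an optimized sampling mode would replace. Any prioritization or pool-splitting rule would go inside `sample`.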
