A reinforcement learning control approach for underwater manipulation under position and torque constraints

In marine operations, underwater manipulators play an essential role. However, due to uncertainties in the dynamic model and disturbances caused by the environment, low-level control methods must be highly adaptable to change. Position and torque constraints place further demands on the control system. Reinforcement learning is a data-driven control technique that can learn complex control policies without requiring a model. The learning capabilities of this type of agent allow for great adaptability to changes in operating conditions. In this article, we present a novel reinforcement learning low-level controller for the position control of an underwater manipulator under torque and position constraints. The reinforcement learning agent is based on an actor-critic architecture and uses sensor readings as state information. Simulation results using the Reach Alpha 5 underwater manipulator show the advantages of the proposed control strategy.
