A hybrid methodology for RL-based behavior coordination in a target-following mission with an AUV

This paper proposes a behavior-based scheme for the high-level control of autonomous underwater vehicles (AUVs). The control scheme has two main characteristics. First, behavior coordination uses a hybrid methodology that combines the robustness and modularity of competitive approaches with the optimized trajectories of cooperative ones. Second, each behavior's state/action mapping is learned by means of reinforcement learning (RL), using a continuous Q-learning algorithm implemented with a feed-forward neural network. The scheme is intended for simple missions in which several behaviors/tasks compete for control of the vehicle. Its feasibility is shown with a target-following mission designed to be carried out in a pool with the AUV ODIN. Simulation results demonstrate the good performance of the hybrid method for behavior coordination and the convergence of the RL-based behaviors.
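
The abstract does not give the coordination rule itself, so the following is only a minimal sketch of one way a hybrid competitive/cooperative coordinator could work, not the paper's actual formulation. It assumes each behavior emits an activation level in [0, 1] and a command vector, and that behaviors are listed from highest to lowest priority. The names `BehaviorOutput` and `coordinate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class BehaviorOutput:
    """Hypothetical output of one behavior (an assumption, not from the paper)."""
    activation: float            # in [0, 1]; 1 means the behavior demands full control
    command: Tuple[float, ...]   # desired set-point, e.g. (surge, yaw)


def coordinate(outputs: Sequence[BehaviorOutput]) -> Tuple[float, ...]:
    """Blend behaviors ordered from highest to lowest priority.

    A fully active high-priority behavior suppresses all lower ones
    (competitive case); a partially active one leaves the remaining share
    of control to lower-priority behaviors (cooperative case).
    """
    if not outputs:
        raise ValueError("at least one behavior output is required")
    dim = len(outputs[0].command)
    blended = [0.0] * dim
    remaining = 1.0  # share of control not yet claimed by higher priorities
    for out in outputs:
        a = min(max(out.activation, 0.0), 1.0)  # clamp activation to [0, 1]
        weight = remaining * a
        for i in range(dim):
            blended[i] += weight * out.command[i]
        remaining *= 1.0 - a
    return tuple(blended)
```

With an activation of 1 on the top-priority behavior, the result equals that behavior's command alone; with an activation of 0.5, half of the control authority passes to the behaviors below it.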
