A behavior-based scheme using reinforcement learning for autonomous underwater vehicles

This paper presents a hybrid behavior-based scheme using reinforcement learning for high-level control of autonomous underwater vehicles (AUVs). The two main features of the approach are hybrid behavior coordination and semi-online neural-Q_learning (SONQL). Hybrid behavior coordination combines the robustness and modularity of the competitive approach with the efficient trajectories of the cooperative approach. SONQL, a new continuous formulation of the Q-learning algorithm built on a multilayer neural network, is used to learn each behavior's state/action mapping online. Experimental results show the feasibility of the approach for AUVs.
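The abstract does not give the details of SONQL, so the following is only a minimal sketch of generic Q-learning with a small multilayer network as the function approximator, not the authors' algorithm. The class `TinyQNet`, the functions `td_target` and `q_learning_step`, the network size, the learning rate, and the discrete set of candidate actions are all illustrative assumptions. The only part taken as given is the standard Q-learning target, r + gamma * max_a' Q(s', a').

```python
import math
import random


class TinyQNet:
    """Hypothetical one-hidden-layer tanh network approximating Q(state, action)."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each hidden row has one extra weight for the bias input.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def _forward(self, state, action):
        x = list(state) + [action, 1.0]          # inputs plus bias
        h = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in self.w1]
        hb = h + [1.0]                            # hidden outputs plus bias
        q = sum(w * v for w, v in zip(self.w2, hb))
        return x, h, hb, q

    def predict(self, state, action):
        return self._forward(state, action)[3]

    def train_step(self, state, action, target, lr):
        """One gradient step on 0.5 * (target - Q)^2; returns the error."""
        x, h, hb, q = self._forward(state, action)
        err = target - q
        # Update the hidden layer first, using the not-yet-updated output weights.
        for j, row in enumerate(self.w1):
            delta = err * self.w2[j] * (1.0 - h[j] ** 2)
            for i in range(len(row)):
                row[i] += lr * delta * x[i]
        for j in range(len(self.w2)):
            self.w2[j] += lr * err * hb[j]
        return err


def td_target(net, reward, next_state, actions, gamma):
    """Standard Q-learning target: r + gamma * max over candidate actions."""
    return reward + gamma * max(net.predict(next_state, a) for a in actions)


def q_learning_step(net, state, action, reward, next_state, actions,
                    gamma=0.9, lr=0.05):
    """Move Q(state, action) toward the target computed from one transition."""
    target = td_target(net, reward, next_state, actions, gamma)
    return net.train_step(state, action, target, lr)


if __name__ == "__main__":
    net = TinyQNet(n_inputs=3, n_hidden=4)        # 2 state dims + 1 action dim
    candidate_actions = [-1.0, 0.0, 1.0]          # assumed discretization
    err = q_learning_step(net, [0.2, -0.1], 1.0, 0.5, [0.3, 0.0],
                          candidate_actions)
    print("TD error on one transition:", err)
```

Here the continuous state is fed straight into the network and only the maximization step uses a discrete set of actions. That is one common way to apply Q-learning without a lookup table; the paper's own treatment of continuous states and actions may differ.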
