HIGH-LEVEL CONTROL OF AUTONOMOUS ROBOTS USING A BEHAVIOR-BASED SCHEME AND REINFORCEMENT LEARNING

Abstract This paper proposes a behavior-based scheme for the high-level control of autonomous robots. The control scheme has two main characteristics. First, behavior coordination is performed through a hybrid methodology that combines the robustness and modularity of competitive approaches with the optimized trajectories of cooperative ones. Second, each behavior's state/action mapping is learnt by means of Reinforcement Learning (RL): a new continuous approach to the Q-learning algorithm, implemented with a multi-layer neural network, is used. The behavior-based scheme attempts to fulfill simple missions in which several behaviors/tasks compete for control of the vehicle. This paper centers on the RL-based behaviors. To test the feasibility of the proposed Neural-Q_learning scheme, real experiments were carried out with the underwater robot ODIN on a target-following behavior. The results show that the behavior converged to an optimal state/action mapping. A discussion of the proposed approach is given, together with an overall description of the high-level control scheme.
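
As a rough illustration of the kind of update a Neural-Q_learning behavior performs, the sketch below trains a one-hidden-layer network toward the standard Q-learning target r + gamma * max_a' Q(s', a'). This is a minimal sketch under stated assumptions, not the authors' implementation: the state dimension, the discretized action set (the paper's approach is continuous), the network size, and all hyperparameters below are illustrative choices.

```python
# Minimal sketch of a neural-network Q-learning update (illustrative only).
# A one-hidden-layer network maps a continuous state to one Q-value per
# discretized action; weights are adjusted by a temporal-difference step.
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 4    # e.g. relative target position/velocity (assumed)
N_ACTIONS = 5    # discretized control commands (assumed)
HIDDEN = 16      # hidden units (assumed)
ALPHA = 0.01     # learning rate
GAMMA = 0.9      # discount factor

# Network weights: state -> tanh hidden layer -> Q-value per action.
W1 = rng.normal(0.0, 0.1, (HIDDEN, STATE_DIM))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.1, (N_ACTIONS, HIDDEN))
b2 = np.zeros(N_ACTIONS)

def q_values(s):
    """Forward pass: Q-values for all actions, plus hidden activations."""
    h = np.tanh(W1 @ s + b1)
    return W2 @ h + b2, h

def select_action(s, eps=0.1):
    """Epsilon-greedy exploration over the discretized action set."""
    if rng.random() < eps:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(s)[0]))

def td_update(s, a, r, s_next, done):
    """One Q-learning step: move Q(s, a) toward r + gamma * max Q(s', .)."""
    global W1, b1, W2, b2
    q, h = q_values(s)
    q_next, _ = q_values(s_next)
    target = r if done else r + GAMMA * np.max(q_next)
    err = target - q[a]                       # TD error on the taken action
    dh = err * W2[a] * (1.0 - h ** 2)         # backprop through tanh layer
    W2[a] += ALPHA * err * h                  # update only the chosen output
    b2[a] += ALPHA * err
    W1 += ALPHA * np.outer(dh, s)
    b1 += ALPHA * dh

# Example update on a dummy transition (placeholder values):
s = rng.normal(size=STATE_DIM)
a = select_action(s)
s_next = rng.normal(size=STATE_DIM)
td_update(s, a, r=1.0, s_next=s_next, done=False)
```

The function approximator replaces the tabular Q-function so that continuous sensor readings need not be discretized into states; the action discretization here is purely a simplification of the sketch.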
