Paper information - Reinforcement learning with function approximation for cooperative navigation tasks

Reinforcement learning with function approximation for cooperative navigation tasks

In this paper, we propose a reinforcement learning approach to multi-robot cooperative navigation tasks in infinite settings. The proposed algorithm addresses learning and coordination simultaneously, extending existing algorithms in the literature to problems with an infinite state-space. We also present results obtained in several test scenarios featuring multi-robot navigation with partial observability.

Francisco S. Melo | M. Isabel Ribeiro

[1] Matthijs T. J. Spaan, et al. An approach to noncommunicative multiagent coordination in continuous domains, 2002.

[2] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.

[3] Craig Boutilier, et al. Sequential Optimality and Coordination in Multiagent Systems, 1999, IJCAI.

[4] L. Shapley, et al. Stochastic Games, 1953, Proceedings of the National Academy of Sciences.

[5] Nicholas V. Findler, et al. Social Structures and the Problem of Coordination in Intelligent Agent Societies, 2000.

[6] Lennart Ljung, et al. Analysis of recursive stochastic algorithms, 1977.

[7] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.

[8] Michael P. Wellman, et al. Nash Q-Learning for General-Sum Stochastic Games, 2003, J. Mach. Learn. Res.

[9] R. Bellman. Dynamic programming, 1957, Science.

[10] Dieter Fox, et al. Markov localization - a probabilistic framework for mobile robot localization and navigation, 1998.

[11] Richard L. Tweedie, et al. Markov Chains and Stochastic Stability, 1993, Communications and Control Engineering Series.

[12] Xiaofeng Wang, et al. Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games, 2002, NIPS.

[13] Tommi S. Jaakkola, et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms, 2000, Machine Learning.

[14] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.

[15] Felix A. Fischer, et al. Hierarchical reinforcement learning in communication-mediated multiagent coordination, 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004).

[16] Francisco S. Melo, et al. Emerging coordination in infinite team Markov games, 2008, AAMAS.

[17] A. Cassandra, et al. Exact and approximate algorithms for partially observable Markov decision processes, 1998.

[18] Francisco S. Melo, et al. Learning to Coordinate in Topological Navigation Tasks, 2007.

[19] Craig Boutilier, et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.

[20] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[21] Francisco S. Melo, et al. Q-Learning with Linear Function Approximation, 2007, COLT.

[22] Craig Boutilier, et al. Planning, Learning and Coordination in Multiagent Decision Processes, 1996, TARK.

[23] Martin Lauer, et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems, 2000, ICML.

[24] Michail G. Lagoudakis, et al. Coordinated Reinforcement Learning, 2002, ICML.

[25] M. Pelletier. On the almost sure asymptotic behaviour of stochastic algorithms, 1998.

[26] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.