论文信息 - Reinforcement Learning Approaches to Coordination in Cooperative Multi-agent Systems

Reinforcement Learning Approaches to Coordination in Cooperative Multi-agent Systems

We report on an investigation of reinforcement learning techniques for the learning of coordination in cooperative multi-agent systems. Specifically, we focus on two novel approaches: one is based on a new action selection strategy for Q-learning [10], and the other is based on model estimation with a shared action-selection protocol. The new techniques are applicable to scenarios where mutual observation of actions is not possible. To date, reinforcement learning approaches for such independent agents did not guarantee convergence to the optimal joint action in scenarios with high miscoordination costs. We improve on previous results [2] by demonstrating empirically that our extension causes the agents to converge almost always to the optimal joint action even in these difficult cases.

[1] C. Watkins. Learning from delayed rewards , 1989 .

[2] Ming Tan,et al. Multi-Agent Reinforcement Learning: Independent versus Cooperative Agents , 1997, ICML.

[3] Gerhard Weiss,et al. Learning to Coordinate Actions in Multi-Agent-Systems , 1993, IJCAI.

[4] Sandip Sen,et al. Learning to Coordinate without Sharing Information , 1994, AAAI.

[5] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[6] Sandip Sen,et al. Individual learning of coordination knowledge , 1998, J. Exp. Theor. Artif. Intell..

[7] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.

[8] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .

[9] Craig Boutilier,et al. Sequential Optimality and Coordination in Multiagent Systems , 1999, IJCAI.

[10] Martin Lauer,et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.

[11] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.