Paper Information - All learning is Local: Multi-agent Learning in Global Reward Games

All learning is Local: Multi-agent Learning in Global Reward Games

In large multiagent games, partial observability, coordination, and credit assignment persistently plague attempts to design good learning algorithms. We provide a simple and efficient algorithm that in part uses a linear system to model the world from a single agent's limited perspective, and takes advantage of Kalman filtering to allow an agent to construct a good training signal and learn an effective policy.
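The abstract does not spell out the linear model, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the global reward an agent observes is its own per-state reward plus a scalar random-walk term standing in for everything the other agents contribute, and it uses a standard Kalman filter to separate the two. The names `GlobalRewardFilter`, `drift_var`, `obs_var`, and `demo` are hypothetical and chosen for illustration.

```python
import numpy as np


class GlobalRewardFilter:
    """Kalman filter over the hidden vector x = [r(0), ..., r(n-1), b].

    Assumed model (an illustration, not taken from the abstract):
      r(s): the agent's own reward in state s, constant over time
      b:    contribution of all other agents, drifting as a random walk
      observed global reward g = r(s) + b
    """

    def __init__(self, n_states, drift_var=1.0, obs_var=1e-6, init_var=1.0):
        self.n = n_states
        self.x = np.zeros(n_states + 1)            # state estimate
        self.P = init_var * np.eye(n_states + 1)   # estimate covariance
        self.Q = np.zeros((n_states + 1, n_states + 1))
        self.Q[-1, -1] = drift_var                 # only b drifts
        self.R = obs_var                           # small observation noise

    def update(self, state, global_reward):
        # Predict step: identity dynamics, so only the covariance grows.
        P = self.P + self.Q
        # Observation row: picks r(state) and b.
        c = np.zeros(self.n + 1)
        c[state] = 1.0
        c[-1] = 1.0
        # Correct step: scalar innovation, vector Kalman gain.
        innovation = global_reward - c @ self.x
        s = c @ P @ c + self.R
        k = P @ c / s
        self.x = self.x + k * innovation
        P = P - np.outer(k, c @ P)
        self.P = 0.5 * (P + P.T)                   # keep P symmetric
        # Filtered estimate of the agent's own reward in this state.
        return self.x[state]


def demo(steps=3000, seed=0):
    """Simulate the assumed model and return the estimated per-state rewards."""
    rng = np.random.default_rng(seed)
    true_r = np.array([0.0, 1.0, 3.0])
    filt = GlobalRewardFilter(n_states=3, drift_var=0.25)
    b = 0.0
    for _ in range(steps):
        s = rng.integers(3)              # agent visits a random state
        b += rng.normal(0.0, 0.5)        # other agents' contribution drifts
        filt.update(s, true_r[s] + b)    # agent sees only the global reward
    return filt.x[:3]
```

Under this model only the differences between per-state rewards are identifiable, since a constant can be shifted between `r` and `b` without changing any observation. That is enough for a training signal, because a learner that compares states or actions depends on relative rather than absolute reward. The value returned by `update` could then be passed to a standard learner such as Q-learning in place of the raw global reward.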
