LEARNING FROM DELAYED REWARDS USING INFLUENCE VALUES APPLIED TO COORDINATION IN MULTI-AGENT SYSTEMS
[1] Fabrice R. Noreils, et al. Toward a Robot Architecture Integrating Cooperation between Mobile Robots: Application to Indoor Environment, 1993, Int. J. Robotics Res.
[2] Craig Boutilier, et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.
[3] Rachid Alami, et al. Robots that Cooperatively Enhance Their Plans, 2000, DARS.
[4] Daniel Kudenko, et al. Reinforcement learning of coordination in cooperative multi-agent systems, 2002, AAAI/IAAI.
[5] W. L. Johnson, et al. Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, 2002.
[6] Akira Hayashi, et al. A multiagent reinforcement learning algorithm using extended optimal response, 2002, AAMAS '02.
[7] Kagan Tumer, et al. Learning sequences of actions in collectives of autonomous agents, 2002, AAMAS '02.
[8] Sandip Sen, et al. Towards a pareto-optimal solution in general-sum games, 2003, AAMAS '03.
[9] V. Kononen, et al. Asymmetric multiagent reinforcement learning, 2003, IEEE/WIC International Conference on Intelligent Agent Technology, 2003. IAT 2003.
[10] Craig Boutilier, et al. Coordination in multiagent reinforcement learning: a Bayesian approach, 2003, AAMAS '03.
[11] Nikos A. Vlassis, et al. Sparse cooperative Q-learning, 2004, ICML.
[12] Tommi S. Jaakkola, et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms, 2000, Machine Learning.
[13] Ville Könönen, et al. Asymmetric multiagent reinforcement learning, 2003, Web Intell. Agent Syst.