Asymmetric multiagent reinforcement learning