Nash Q-Learning for General-Sum Stochastic Games
[1] A. M. Fink,et al. Equilibrium in a stochastic $n$-person game , 1964 .
[2] John C. Harsanyi,et al. A General Theory of Equilibrium Selection in Games , 1989 .
[3] C. Watkins. Learning from delayed rewards , 1989 .
[4] John Cubbin,et al. Optimality and Equilibria in Stochastic Games , 1994 .
[5] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[6] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[7] Ariel Rubinstein,et al. A Course in Game Theory , 1995 .
[8] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[9] Kenji Fukumoto,et al. Multi-agent Reinforcement Learning: A Modular Approach , 1996 .
[10] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.
[11] R. McKelvey,et al. Computation of equilibria in finite games , 1996 .
[12] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[13] Francisco J. Vico,et al. Residual Q-Learning Applied to Visual Attention , 1996, ICML.
[14] J. Filar,et al. Competitive Markov Decision Processes , 1996 .
[15] Dimitri P. Bertsekas,et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems , 1996, NIPS.
[16] Tucker Balch,et al. Learning Roles: Behavioral Diversity in Robot Teams , 1997 .
[17] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[18] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[19] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[20] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[21] Craig Boutilier,et al. Sequential Optimality and Coordination in Multiagent Systems , 1999, IJCAI.
[22] Csaba Szepesvári,et al. A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms , 1999, Neural Computation.
[23] Michael P. Wellman,et al. Learning in dynamic noncooperative multiagent systems , 1999 .
[24] Eric van Damme,et al. Non-Cooperative Games , 2000 .
[25] Ronen I. Brafman,et al. A near-optimal polynomial time algorithm for learning in certain classes of stochastic games , 2000, Artif. Intell..
[26] Jeffrey O. Kephart,et al. Pseudo-convergent Q-Learning by Competitive Pricebots , 2000, ICML.
[27] Marilyn A. Walker,et al. An Application of Reinforcement Learning to Dialogue Strategy Selection in a Spoken Dialogue System for Email , 2000, J. Artif. Intell. Res..
[28] Manu Sridharan,et al. Multi-agent Q-learning and regression trees for automated pricing decisions , 2000, Proceedings Fourth International Conference on MultiAgent Systems.
[29] Michael H. Bowling,et al. Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.
[30] Michael P. Wellman,et al. Experimental Results on Q-Learning for General-Sum Stochastic Games , 2000, ICML.
[31] Michael L. Littman,et al. Graphical Models for Game Theory , 2001, UAI.
[32] Bikramjit Banerjee,et al. Fast Concurrent Reinforcement Learners , 2001, IJCAI.
[33] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[34] Daphne Koller,et al. Multi-Agent Influence Diagrams for Representing and Solving Games , 2001, IJCAI.
[35] Michael L. Littman,et al. Value-function reinforcement learning in Markov games , 2001, Cognitive Systems Research.
[36] S.H.G. ten Hagen. Continuous State Space Q-Learning for control of Nonlinear Systems , 2001 .
[37] Peter Stone,et al. Scaling Reinforcement Learning toward RoboCup Soccer , 2001, ICML.
[38] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[39] Akira Hayashi,et al. A multiagent reinforcement learning algorithm using extended optimal response , 2002, AAMAS '02.
[40] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[41] Yoav Shoham,et al. Multi-Agent Reinforcement Learning: a critical survey , 2003 .
[42] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[43] Craig Boutilier,et al. Coordination in multiagent reinforcement learning: a Bayesian approach , 2003, AAMAS '03.
[44] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[45] Michael P. Wellman,et al. Conjectural Equilibrium in Multiagent Learning , 1998, Machine Learning.
[46] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.
[47] Paul Bourgine,et al. Exploration of Multi-State Environments: Local Measures and Back-Propagation of Uncertainty , 1999, Machine Learning.
[48] Jürgen Schmidhuber,et al. Fast Online Q(λ) , 1998, Machine Learning.
[49] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine Learning.
[50] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.