Multiagent learning using a variable learning rate

[1]  Andrew G. Barto,et al.  Reinforcement learning , 1998 .

[2]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[3]  Manuela Veloso,et al.  An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning , 2000 .

[4]  Manuela M. Veloso,et al.  Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.

[5]  Craig Boutilier,et al.  The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.

[6]  L. Shapley Stochastic Games , 1953, Proceedings of the National Academy of Sciences.

[7]  Manuela M. Veloso,et al.  Convergence of Gradient Dynamics with a Variable Learning Rate , 2001, ICML.

[8]  L. C. Thomas Stochastic Games with Finite State and Action Spaces , 1988 .

[9]  J. Robinson An Iterative Method of Solving a Game , 1951 .

[10]  Ben J. A. Kröse,et al.  Learning from delayed rewards , 1995, Robotics Auton. Syst..

[11]  Tommi S. Jaakkola,et al.  Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.

[12]  Michael P. Wellman,et al.  Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.

[13]  Michael H. Bowling,et al.  Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.

[14]  Michael L. Littman,et al.  Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.

[15]  J. Nash Equilibrium Points in N-Person Games , 1950, Proceedings of the National Academy of Sciences of the United States of America.

[16]  Jörgen W. Weibull,et al.  Evolutionary Game Theory , 1995 .

[18]  Michael I. Jordan,et al.  Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.

[19]  Yishay Mansour,et al.  Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.

[20]  Yishay Mansour,et al.  Nash Convergence of Gradient Dynamics in General-Sum Games , 2000, UAI.

[21]  Andrew W. Moore,et al.  Gradient Descent for General Reinforcement Learning , 1998, NIPS.

[22]  Sandip Sen,et al.  Learning to Coordinate without Sharing Information , 1994, AAAI.

[23]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .

[24]  William T. B. Uther,et al.  Adversarial Reinforcement Learning , 2003 .

[25]  O. Mangasarian,et al.  Two-person nonzero-sum games and quadratic programming , 1964 .

[26]  Peter L. Bartlett,et al.  Reinforcement Learning in POMDP's via Direct Gradient Ascent , 2000, ICML.

[27]  Nils J. Nilsson,et al.  Artificial Intelligence , 1974, IFIP Congress.

[28]  R. Howard Dynamic Programming and Markov Processes , 1960 .

[29]  H. Kuhn Classics in Game Theory , 1997 .

[30]  J. Filar,et al.  Competitive Markov Decision Processes , 1996 .

[31]  Hervé Reinhard,et al.  Differential equations: Foundations and applications , 1986 .

[32]  Michael P. Wellman,et al.  Learning in dynamic noncooperative multiagent systems , 1999 .

[33]  Avrim Blum,et al.  On-line Learning and the Metrical Task System Problem , 2000, COLT '97.

[34]  Ariel Rubinstein,et al.  A Course in Game Theory , 1995 .

[35]  D. Fudenberg,et al.  The Theory of Learning in Games , 1998 .

[36]  A. M. Fink Equilibrium in a stochastic $n$-person game , 1964 .