Value-function reinforcement learning in Markov games

[1]  Manuela Veloso,et al.  An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning , 2000 .

[2]  Ron Sun,et al.  Rationality Assumptions and Optimality of Co-learning , 2000, PRIMA.

[3]  Sandip Sen,et al.  Evaluating concurrent reinforcement learners , 2000, Proceedings Fourth International Conference on MultiAgent Systems.

[4]  Michael P. Wellman,et al.  Experimental Results on Q-Learning for General-Sum Stochastic Games , 2000, ICML.

[5]  Michael H. Bowling,et al.  Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.

[6]  Csaba Szepesvári,et al.  A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms , 1999, Neural Computation.

[7]  Michael P. Wellman,et al.  Learning in dynamic noncooperative multiagent systems , 1999 .

[8]  Michael P. Wellman,et al.  Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.

[9]  Craig Boutilier,et al.  The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.

[10]  V. Borkar.  Asynchronous Stochastic Approximations , 1998 .

[11]  Jerzy A. Filar,et al.  Markov Decision Processes: The Noncompetitive Case , 1997 .

[12]  Jerzy A. Filar,et al.  Competitive Markov Decision Processes , 1997 .

[13]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res.

[14]  Craig Boutilier,et al.  Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.

[15]  Csaba Szepesvári,et al.  A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.

[16]  Sandip Sen,et al.  Learning to Coordinate without Sharing Information , 1994, AAAI.

[17]  Michael L. Littman,et al.  Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.

[18]  Martin L. Puterman,et al.  Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[19]  Michael I. Jordan,et al.  MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .

[20]  Richard S. Sutton,et al.  Learning and Sequential Decision Making , 1989 .

[21]  C. Watkins.  Learning from Delayed Rewards , 1989 .

[22]  Stef Tijs,et al.  Fictitious play applied to sequences of games and discounted stochastic games , 1982 .

[23]  R. Milner.  Mathematical Centre Tracts , 1976 .

[24]  R. Howard.  Dynamic Programming and Markov Processes , 1960 .

[25]  L. Shapley,et al.  Stochastic Games , 1953, Proceedings of the National Academy of Sciences.

[26]  J. Neumann,et al.  Theory of Games and Economic Behavior , 1945 .