Multiagent learning in the presence of agents with limitations
[1] Brett Browning,et al. ÜberSim: a multi-robot simulator for robot soccer , 2003, AAMAS '03.
[2] Andrew W. Moore,et al. Prioritized sweeping: Reinforcement learning with less data and less time , 2004, Machine Learning.
[3] Hervé Reinhard,et al. Differential equations: Foundations and applications , 1986 .
[4] Sandip Sen,et al. Learning to Coordinate without Sharing Information , 1994, AAAI.
[5] O. Mangasarian,et al. Two-person nonzero-sum games and quadratic programming , 1964 .
[6] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[7] R. Karp,et al. On Nonterminating Stochastic Games , 1966 .
[8] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[9] O. J. Vrieze,et al. Stochastic Games with Finite State and Action Spaces. , 1988 .
[10] J. Goodman. Note on Existence and Uniqueness of Equilibrium Points for Concave N-Person Games , 1965 .
[11] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[12] David Carmel,et al. Learning Models of Intelligent Agents , 1996, AAAI/IAAI, Vol. 1.
[13] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.
[14] M. Veloso,et al. Bounding the suboptimality of reusing subproblems , 1999, IJCAI 1999.
[15] E. Kalai,et al. Rational Learning Leads to Nash Equilibrium , 1993 .
[16] Yishay Mansour,et al. Nash Convergence of Gradient Dynamics in General-Sum Games , 2000, UAI.
[17] Shie Mannor,et al. Adaptive Strategies and Regret Minimization in Arbitrarily Varying Markov Environments , 2001, COLT/EuroCOLT.
[18] M. F.,et al. Bibliography , 1985, Experimental Gerontology.
[19] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[20] Andrew G. Barto,et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density , 2001, ICML.
[21] Manuela M. Veloso,et al. Existence of Multiagent Equilibria with Limited Agents , 2004, J. Artif. Intell. Res..
[22] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[23] Geoffrey J. Gordon. Reinforcement Learning with Function Approximation Converges to a Region , 2000, NIPS.
[24] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.
[25] S. Hart,et al. Uncoupled Dynamics Do Not Lead to Nash Equilibrium , 2003 .
[26] Xiaofeng Wang,et al. Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games , 2002, NIPS.
[27] Robert E. Tarjan,et al. Self-adjusting binary search trees , 1985, JACM.
[28] Tuomas Sandholm,et al. Bargaining with limited computation: Deliberation equilibrium , 2001, Artif. Intell..
[29] T. Speed,et al. Interview of Albert Tucker , 1975 .
[30] Chris Watkins,et al. Learning from delayed rewards , 1989 .
[31] Manuela M. Veloso,et al. Real-time randomized path planning for robot navigation , 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[32] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[33] S. Ross. GOOFSPIEL -- THE GAME OF PURE STRATEGY , 1971 .
[34] S. Hart,et al. Uncoupled Dynamics Cannot Lead to Nash Equilibrium , 2002 .
[35] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[36] T. Cormen,et al. Model-based Learning of Interaction Strategies in Multi-agent Systems , 1997 .
[37] Michael P. Wellman,et al. Learning in dynamic noncooperative multiagent systems , 1999 .
[38] Jörgen W. Weibull,et al. Evolutionary Game Theory , 1996 .
[39] Dov Samet,et al. Learning to play games in extensive form by valuation , 2001, J. Econ. Theory.
[40] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[41] J. Albus. A Theory of Cerebellar Function , 1971 .
[42] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[43] Peter L. Bartlett,et al. Reinforcement Learning in POMDP's via Direct Gradient Ascent , 2000, ICML.
[44] R. McKelvey,et al. Computation of equilibria in finite games , 1996 .
[45] Hiroaki Kitano,et al. RoboCup: A Challenge Problem for AI , 1997, AI Mag..
[46] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[47] J. Wal. Discounted Markov games; successive approximation and stopping times , 1977 .
[48] Doina Precup,et al. Intra-Option Learning about Temporally Abstract Actions , 1998, ICML.
[49] M. Pollatschek,et al. Algorithms for Stochastic Games with Geometrical Interpretation , 1969 .
[50] E. Rowland. Theory of Games and Economic Behavior , 1946, Nature.
[51] Vincent Conitzer,et al. Complexity Results about Nash Equilibria , 2002, IJCAI.
[52] J. Filar,et al. Competitive Markov Decision Processes , 1996 .
[53] Gunes Ercal,et al. On No-Regret Learning, Fictitious Play, and Nash Equilibrium , 2001, ICML.
[54] J. Robinson. AN ITERATIVE METHOD OF SOLVING A GAME , 1951, Classics in Game Theory.
[55] Manuela M. Veloso,et al. Planning for Distributed Execution through Use of Probabilistic Opponent Models , 2002, AIPS.
[56] J. Nash. Equilibrium Points in N-Person Games. , 1950, Proceedings of the National Academy of Sciences of the United States of America.
[57] Bikramjit Banerjee,et al. Convergent Gradient Ascent in General-Sum Games , 2002, ECML.
[58] Nicolò Cesa-Bianchi,et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem , 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[59] L. C. Thomas,et al. Stochastic Games with Finite State and Action Spaces , 1988 .
[60] Milos Hauskrecht,et al. Hierarchical Solution of Markov Decision Processes using Macro-actions , 1998, UAI.
[61] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[62] Ariel Rubinstein,et al. A Course in Game Theory , 1995 .
[63] Manuela Veloso,et al. Scalable Learning in Stochastic Games , 2002 .
[64] Maja J. Mataric,et al. Reward Functions for Accelerated Learning , 1994, ICML.
[65] Manuela M. Veloso,et al. Convergence of Gradient Dynamics with a Variable Learning Rate , 2001, ICML.
[66] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[67] Ronen I. Brafman,et al. Efficient learning equilibrium , 2004, Artificial Intelligence.
[68] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[69] Shlomo Zilberstein,et al. Models of Bounded Rationality , 1995 .
[70] Peter Stone,et al. Scaling Reinforcement Learning toward RoboCup Soccer , 2001, ICML.
[71] Manuela M. Veloso,et al. On Behavior Classification in Adversarial Environments , 2000, DARS.
[72] Itzhak Gilboa,et al. Bounded Versus Unbounded Rationality: The Tyranny of the Weak , 1989 .
[73] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[74] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[75] H. Kuhn. Classics in Game Theory , 1997 .
[76] Michael H. Bowling,et al. Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.
[77] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine Learning.
[78] Craig Boutilier,et al. Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.
[79] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[80] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[81] G. Brown. SOME NOTES ON COMPUTATION OF GAMES SOLUTIONS , 1949 .
[82] Avrim Blum,et al. On-line Learning and the Metrical Task System Problem , 1997, COLT '97.
[83] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[84] Ian Frank,et al. Soccer Server: A Tool for Research on Multiagent Systems , 1998, Appl. Artif. Intell..
[85] Stuart J. Russell. Rationality and Intelligence , 1995, IJCAI.
[86] Manfred K. Warmuth,et al. The Weighted Majority Algorithm , 1994, Inf. Comput..
[87] Manuela Veloso,et al. Tree based hierarchical reinforcement learning , 2002 .
[88] William T. B. Uther,et al. Adversarial Reinforcement Learning , 2003 .
[89] Peter Stone,et al. Leading Best-Response Strategies in Repeated Games , 2001, International Joint Conference on Artificial Intelligence.
[90] A. Rubinstein. Modeling Bounded Rationality , 1998 .
[91] Ronald J. Williams,et al. Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions , 1993 .
[92] Eitan Zemel,et al. Nash and correlated equilibria: Some complexity considerations , 1989 .
[93] Brett Browning,et al. Improbability filtering for rejecting false positives , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation.
[94] Robert H. Crites,et al. Multiagent reinforcement learning in the Iterated Prisoner's Dilemma. , 1996, Bio Systems.
[95] Peter J. Jansen,et al. Using knowledge about the opponent in game-tree search , 1992 .