Learning against opponents with bounded memory
[1] Michael H. Bowling, et al. Convergence and No-Regret in Multiagent Learning, 2004, NIPS.
[2] D. Fudenberg, et al. Consistency and Cautious Fictitious Play, 1995.
[3] Vincent Conitzer, et al. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents, 2003, Machine Learning.
[4] Craig Boutilier, et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.
[5] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[6] Gunes Ercal, et al. On No-Regret Learning, Fictitious Play, and Nash Equilibrium, 2001, ICML.
[7] E. Kalai, et al. Rational Learning Leads to Nash Equilibrium, 1993.
[8] Ivana Kruijff-Korbayová, et al. A Portfolio Approach to Algorithm Selection, 2003, IJCAI.
[9] Thomas G. Dietterich, et al. In Advances in Neural Information Processing Systems 12, 1991, NIPS 1991.
[10] Illah R. Nourbakhsh, et al. Learning Probabilistic Models for Decision-Theoretic Navigation of Mobile Robots, 2000, ICML.
[11] Yoav Shoham, et al. Run the GAMUT: a comprehensive approach to evaluating game-theoretic algorithms, 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004).
[12] Mihalis Yannakakis, et al. On complexity as bounded rationality (extended abstract), 1994, STOC '94.
[13] W. Hamilton, et al. The Evolution of Cooperation, 1984.
[14] Lonnie Chrisman, et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach, 1992, AAAI.
[15] D. Fudenberg, et al. Conditional Universal Consistency, 1999.
[16] Leslie Pack Kaelbling, et al. Playing is believing: The role of beliefs in multi-agent learning, 2001, NIPS.
[17] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[18] S. Hart, et al. A simple adaptive procedure leading to correlated equilibrium, 2000.
[19] Yoav Shoham, et al. New Criteria and a New Algorithm for Learning in Multi-Agent Systems, 2004, NIPS.
[20] Zoubin Ghahramani, et al. Proceedings of the 24th International Conference on Machine Learning, 2007, ICML 2007.
[21] Peter Stone, et al. Implicit Negotiation in Repeated Games, 2001, ATAL.
[22] Michael I. Jordan, et al. Advances in Neural Information Processing Systems 30, 1995.
[23] A. Neyman. Bounded complexity justifies cooperation in the finitely repeated prisoners' dilemma, 1985.
[24] Nimrod Megiddo, et al. How to Combine Expert (and Novice) Advice when Actions Impact the Environment?, 2003, NIPS.
[25] W. Hoeffding. On the Distribution of the Number of Successes in Independent Trials, 1956.
[26] O. H. Brownlee, et al. Activity Analysis of Production and Allocation, 1952.
[27] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[28] Nils J. Nilsson, et al. Artificial Intelligence, 1974, IFIP Congress.
[29] Yoav Shoham, et al. A Portfolio Approach to Algorithm Selection, 2003, IJCAI 2003.
[30] Manuela M. Veloso, et al. Multiagent learning using a variable learning rate, 2002, Artif. Intell.
[31] Gerald Tesauro, et al. Extending Q-Learning to General Adaptive Multi-Agent Systems, 2003, NIPS.