Archie C. Chapman | Nicholas R. Jennings | Enrique Munoz de Cote | Adam M. Sykulski
[1] Bikramjit Banerjee, et al. Efficient learning of multi-step best response, 2005, AAMAS '05.
[2] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[3] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[4] Archie C. Chapman, et al. EA2: The Winning Strategy for the Inaugural Lemonade Stand Game Tournament, 2010, ECAI.
[5] Amy Greenwald, et al. An Algorithm for Computing Stochastically Stable Distributions with Applications to Multiagent Learning in Repeated Games, 2005, UAI.
[6] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[7] David C. Parkes, et al. Learning and Solving Many-Player Games through a Cluster-Based Representation, 2008, UAI.
[8] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[9] Michael L. Littman, et al. Social reward shaping in the prisoner's dilemma, 2008, AAMAS.
[10] Peter Stone, et al. Implicit Negotiation in Repeated Games, 2001, ATAL.
[11] Nicholas R. Jennings, et al. Planning against fictitious players in repeated normal form games, 2010, AAMAS.