Designing Learning Algorithms over the Sequence Form of an Extensive-Form Game
[2] J. Cross. A Stochastic Learning Model of Economic Behavior, 1973.
[3] Marc Lanctot, et al. Computing Approximate Nash Equilibria and Robust Best-Responses Using Sampling, 2011, J. Artif. Intell. Res.
[4] Drew Fudenberg, et al. Game theory (3. pr.), 1991.
[5] J. Neumann, et al. Theory of games and economic behavior, 1945, 100 Years of Math Milestones.
[6] Michael H. Bowling, et al. No-Regret Learning in Extensive-Form Games with Imperfect Recall, 2012, ICML.
[7] Karl Tuyls, et al. An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games, 2005, Autonomous Agents and Multi-Agent Systems.
[8] Michael H. Bowling, et al. Regret Minimization in Games with Incomplete Information, 2007, NIPS.
[9] Karl Tuyls, et al. Evolutionary Dynamics of Multi-Agent Learning: A Survey, 2015, J. Artif. Intell. Res.
[10] Michael L. Littman, et al. Classes of Multiagent Q-learning Dynamics with epsilon-greedy Exploration, 2010, ICML.
[11] Duane Szafron, et al. Using counterfactual regret minimization to create competitive multiplayer poker agents, 2010, AAMAS.
[12] B. Stengel, et al. Efficient Computation of Behavior Strategies, 1996.
[13] Nicolò Cesa-Bianchi, et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[14] Marcello Restelli, et al. Efficient Evolutionary Dynamics with Extensive-Form Games, 2013, AAAI.
[15] Dries Vermeulen, et al. The reduced form of a game, 1998, Eur. J. Oper. Res.
[16] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.
[17] Karl Tuyls, et al. A common gradient in multi-agent reinforcement learning, 2012, AAMAS.
[18] Richard Gibson, et al. Regret Minimization in Non-Zero-Sum Games with Applications to Building Champion Multiplayer Computer Poker Agents, 2013, ArXiv.
[19] Marcello Restelli, et al. Evolutionary Dynamics of Q-Learning over the Sequence Form, 2014, AAAI.
[20] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[21] Ryszard Kowalczyk, et al. Dynamic analysis of multiagent Q-learning with ε-greedy exploration, 2009, ICML.