Solving for Best Responses and Equilibria in Extensive-Form Games with Reinforcement Learning Methods
[1] Russell Bent,et al. Modeling Humans as Reinforcement Learners: How to Predict Human Behavior in Multi-Stage Games , 2011 .
[2] H. Young,et al. Handbook of Game Theory with Economic Applications , 2015 .
[3] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[4] What Is Game Theory Trying to Accomplish? , 1985 .
[5] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[6] Philip Wolfe,et al. Contributions to the theory of games , 1953 .
[7] Andrew McLennan,et al. Gambit: Software Tools for Game Theory , 2006 .
[8] Dana H. Ballard,et al. Learning to perceive and act by trial and error , 1991, Machine Learning.
[9] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[10] E. Ordentlich,et al. Inequalities for the L1 Deviation of the Empirical Distribution , 2003 .
[11] A. Mas-Colell,et al. Microeconomic Theory , 1995 .
[12] D. Koller,et al. Efficient Computation of Equilibria for Extensive Two-Person Games , 1996 .
[13] Jean-Francois Richard,et al. Approximation of Nash equilibria in Bayesian games , 2008 .
[14] Brett Katzman,et al. A Two Stage Sequential Auction with Multi-Unit Demands , 1999 .
[15] Tuomas Sandholm,et al. Hierarchical Abstraction, Distributed Equilibrium Computation, and Post-Processing, with Application to a Champion No-Limit Texas Hold'em Agent , 2015, AAAI Workshop: Computer Poker and Imperfect Information.
[16] R. Weber. Multiple-Object Auctions , 1981 .
[17] Sergiu Hart,et al. Games in extensive and strategic forms , 1992 .
[18] Kevin Waugh,et al. Monte Carlo Sampling for Regret Minimization in Extensive Games , 2009, NIPS.
[19] J. Nash. Non-Cooperative Games , 1951, Classics in Game Theory.
[20] H. W. Kuhn,et al. Extensive Games and the Problem of Information , 1953 .
[21] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[22] Flavio M. Menezes,et al. Synergies and price trends in sequential auctions , 1999 .
[23] Tuomas Sandholm,et al. Computing Equilibria in Multiplayer Stochastic Games of Imperfect Information , 2009, IJCAI.
[24] Michael P. Wellman,et al. Computing Best-Response Strategies in Infinite Games of Incomplete Information , 2004, UAI.
[25] Roger B. Myerson,et al. Game Theory: Analysis of Conflict , 1991 .
[26] Joel Veness,et al. Monte-Carlo Planning in Large POMDPs , 2010, NIPS.
[27] Shlomo Zilberstein,et al. Dynamic Programming for Partially Observable Stochastic Games , 2004, AAAI.
[28] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[29] Victor Naroditskiy,et al. Using Iterated Best-Response to Find Bayes-Nash Equilibria in Auctions , 2007, AAAI.
[30] Michael P. Wellman,et al. Self-Confirming Price Prediction for Bidding in Simultaneous Ascending Auctions , 2005, UAI.
[31] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..
[32] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[33] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[34] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[35] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[36] Paul W. Goldberg,et al. The Complexity of Computing a Nash Equilibrium , 2009, SIAM J. Comput..
[37] Victor Lesser,et al. Approximately Solving Sequential Games With Incomplete Information , 2008 .
[38] Nicholas R. Jennings,et al. Computing pure Bayesian-Nash equilibria in games with finite actions and continuous types , 2013, Artif. Intell..
[39] Frans A. Oliehoek,et al. Best-response play in partially observable card games , 2005 .
[40] Hilbert J. Kappen,et al. On the Sample Complexity of Reinforcement Learning with a Generative Model , 2012, ICML.
[41] Lihong Li,et al. Reinforcement Learning in Finite MDPs: PAC Analysis , 2009, J. Mach. Learn. Res..
[42] Amy Greenwald,et al. Approximating Equilibria in Sequential Auctions with Incomplete Information and Multi-Unit Demand , 2012, NIPS.