Discretization of Continuous Action Spaces in Extensive-Form Games

Extensive-form games are a powerful tool for modeling a wide range of multiagent scenarios. However, most solution algorithms require discrete, finite games, whereas many real-world domains are naturally modeled with continuous action spaces. This mismatch is usually handled by heuristically discretizing the continuous action space, without any bounds on solution quality. In this paper we address this issue. Leveraging recent results on abstraction solution quality, we develop the first framework for providing bounds on solution quality for discretization of continuous action spaces in extensive-form games. For games where the error is Lipschitz-continuous in the distance from a continuous point to its nearest discrete point, we show that a uniform discretization of the space is optimal. When the error is monotonically increasing in the distance to the nearest discrete point and is described by piecewise linear functions, we develop an integer program for finding the optimal discretization. This result can further be used to approximate optimal solutions for general monotonic error functions. Finally, we discuss how our theory applies to several practical problems for which no solution quality bounds could previously be derived.
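As a rough illustration of the Lipschitz case, the following sketch assumes a one-dimensional action interval [lo, hi] and an error that grows at most linearly, with a hypothetical constant `lipschitz_const`, in the distance to the nearest discrete point. Under those assumptions, placing k points at the centers of k equal-width cells minimizes the worst-case distance, which equals (hi - lo) / (2k). The function names and the interval setting are assumptions made for this example and are not taken from the paper.

```python
def uniform_discretization(lo, hi, k):
    """Return k points at the centers of k equal-width cells of [lo, hi]."""
    if k < 1:
        raise ValueError("k must be at least 1")
    width = (hi - lo) / k
    return [lo + (i + 0.5) * width for i in range(k)]


def worst_case_distance(points, lo, hi):
    """Largest distance from any x in [lo, hi] to its nearest discrete point."""
    pts = sorted(points)
    # Distance from each endpoint of the interval to the closest point.
    gaps = [pts[0] - lo, hi - pts[-1]]
    # Midpoint between two neighbors is the farthest location between them.
    gaps += [(b - a) / 2 for a, b in zip(pts, pts[1:])]
    return max(gaps)


def lipschitz_error_bound(points, lo, hi, lipschitz_const):
    """Upper bound on the error, assuming it is Lipschitz in the distance."""
    return lipschitz_const * worst_case_distance(points, lo, hi)


if __name__ == "__main__":
    uniform = uniform_discretization(0.0, 1.0, 4)
    skewed = [0.0, 0.1, 0.5, 1.0]  # hypothetical non-uniform alternative
    print(uniform, lipschitz_error_bound(uniform, 0.0, 1.0, 2.0))
    print(skewed, lipschitz_error_bound(skewed, 0.0, 1.0, 2.0))
```

With four points on [0, 1], the uniform placement has a worst-case distance of 0.125, while the skewed alternative leaves a gap whose midpoint is 0.25 away from any point, so its bound is twice as large.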
