Closed-form Solutions to a Subclass of Continuous Stochastic Games via Symbolic Dynamic Programming

Zero-sum stochastic games provide a formalism for studying competitive sequential interactions between two agents with diametrically opposed goals and an evolving state. A solution to such games with discrete state was presented by Littman (1994); the continuous-state version remains unsolved. In many instances, continuous-state solutions require nonlinear optimisation, for which closed-form solutions are generally unavailable. We present an exact closed-form solution to a subclass of zero-sum continuous-state stochastic games that can be solved as a parameterised linear program by utilising symbolic dynamic programming. We apply this novel technique to calculate exact solutions to a variety of zero-sum continuous-state stochastic games.
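As an illustrative sketch (not the paper's algorithm), the discrete-state approach of Littman (1994) reduces each state's stage game to a small linear program: the row player maximises the game value v subject to the mixed strategy guaranteeing at least v against every opponent action. The helper name `matrix_game_value` below is ours, and the example assumes SciPy's `linprog` is available:

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    """Value and maximin mixed strategy for the row player of payoff matrix A.

    LP: maximise v subject to  A^T x >= v * 1,  sum(x) = 1,  x >= 0.
    Decision vector z = [x_1, ..., x_m, v]; linprog minimises, so c = [0, ..., 0, -1].
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                                   # maximise v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])      # v - (A^T x)_j <= 0 for each column j
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # probabilities sum to 1
    b_eq = np.ones(1)
    bounds = [(0.0, None)] * m + [(None, None)]    # x >= 0, v unbounded
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[-1], res.x[:m]

# Matching pennies: the value is 0 with the uniform mixed strategy.
value, strategy = matrix_game_value([[1.0, -1.0], [-1.0, 1.0]])
```

The paper's contribution, by contrast, is to handle this optimisation symbolically over continuous state, so that the LP is parameterised by state rather than solved once per discrete state.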

[1] R. A. Howard. Dynamic Programming and Markov Processes, 1960.

[2] Sean R. Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.

[3] Jesse Hoey, et al. APRICODD: Approximate Policy Construction Using Decision Diagrams, 2000, NIPS.

[4] D. P. Bertsekas. Dynamic Programming: Deterministic and Stochastic Models, 1987.

[5] Scott Sanner, et al. Symbolic Dynamic Programming for Discrete and Continuous State MDPs, 2011, UAI.

[6] J. von Neumann. Zur Theorie der Gesellschaftsspiele (On the Theory of Games of Strategy), 1928, Mathematische Annalen.

[7] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.

[8] Lihong Li, et al. Lazy Approximation for Solving Continuous Finite-Horizon MDPs, 2005, AAAI.

[9] Neil Immerman, et al. The Complexity of Decentralized Control of Markov Decision Processes, 2000, UAI.

[10] Ronen I. Brafman, et al. A Heuristic Search Approach to Planning with Continuous Resources in Stochastic Domains, 2009, J. Artif. Intell. Res.

[11] Bruce Bueno de Mesquita, et al. An Introduction to Game Theory, 2014.

[12] R. I. Bahar, et al. Algebraic Decision Diagrams and Their Applications, 1997, Formal Methods in System Design.

[13] Michael P. Wellman, et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, 1998, ICML.

[14] L. S. Shapley. Stochastic Games, 1953, Proceedings of the National Academy of Sciences.

[15] Marek Petrik, et al. Robust Approximate Bilinear Programming for Value Function Approximation, 2011, J. Mach. Learn. Res.

[16] Craig Boutilier, et al. Symbolic Dynamic Programming for First-Order MDPs, 2001, IJCAI.

[17] Scott Sanner, et al. Bounded Approximate Symbolic Dynamic Programming for Hybrid MDPs, 2013, UAI.

[18] Kristin P. Bennett, et al. Bilinear separation of two sets in n-space, 1993, Comput. Optim. Appl.

[19] R. I. Bahar, et al. Algebraic decision diagrams and their applications, 1993, Proceedings of the 1993 International Conference on Computer Aided Design (ICCAD).

[20] S. Sanner, et al. Symbolic Dynamic Programming for Continuous State and Action MDPs, 2012, AAAI.

[21] Michael L. Littman. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.