A Pumping Algorithm for Ergodic Stochastic Mean Payoff Games with Perfect Information

In this paper, we consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph $G=(V=V_B\cup V_W\cup V_R, E)$ with local rewards $r: E \to {\mathbb R}$ and three types of vertices: black $V_B$, white $V_W$, and random $V_R$. The game is played by two players, White and Black. When the play is at a white (black) vertex $v$, White (Black) selects an outgoing arc $(v,u)$. When the play is at a random vertex $v$, a vertex $u$ is picked with the given probability $p(v,u)$. In all cases, Black pays White the value $r(v,u)$. The play continues forever, and White aims to maximize (Black aims to minimize) the limiting mean (that is, average) payoff. It was recently shown in [7] that BWR-games are polynomially equivalent to the classical Gillette games, which include many well-known subclasses, such as cyclic games, simple stochastic games (SSGs), stochastic parity games, and Markov decision processes. In this paper, we give a new algorithm for solving BWR-games in the ergodic case, that is, when the optimal values do not depend on the initial position. Our algorithm solves a BWR-game by reducing it, using a potential transformation, to a canonical form in which the optimal strategies of both players and the value for every initial position are obvious, since a locally optimal move in it is optimal in the whole game. We show that this algorithm is pseudo-polynomial when the number of random nodes is constant. We also provide an almost matching lower bound on its running time, and show that this bound holds for a wider class of algorithms. Let us add that the general (non-ergodic) case is at least as hard as SSGs, for which no pseudo-polynomial algorithm is known.
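To make the notions of potential transformation and canonical form concrete, the following minimal sketch checks them on a hand-made three-vertex game. It is not the pumping algorithm of the paper. It assumes the standard form of a potential transformation, $r_x(v,u) = r(v,u) + x(v) - x(u)$, and reads "canonical form" in the ergodic case as: the local value (maximum over arcs at a white vertex, minimum at a black vertex, expectation at a random vertex) is the same at every vertex. The game, the potentials `x`, and the names `transform` and `local_values` are all hypothetical and chosen for illustration only.

```python
from fractions import Fraction

def transform(rewards, x):
    """Assumed potential transformation: r_x(v,u) = r(v,u) + x[v] - x[u]."""
    return {(v, u): r + x[v] - x[u] for (v, u), r in rewards.items()}

def local_values(kind, rewards, prob):
    """Local value m(v): max over arcs (White), min (Black), expectation (Random)."""
    arcs = {}
    for (v, u), r in rewards.items():
        arcs.setdefault(v, []).append((u, r))
    m = {}
    for v, out in arcs.items():
        if kind[v] == "W":
            m[v] = max(r for _, r in out)
        elif kind[v] == "B":
            m[v] = min(r for _, r in out)
        else:  # random vertex: expected one-step reward
            m[v] = sum(prob[(v, u)] * r for u, r in out)
    return m

# Hypothetical toy game: one white, one black, one random vertex.
kind = {"w": "W", "b": "B", "r": "R"}
rewards = {("w", "b"): 4, ("w", "w"): 1, ("b", "r"): 0,
           ("r", "w"): 2, ("r", "b"): 0}
prob = {("r", "w"): Fraction(1, 2), ("r", "b"): Fraction(1, 2)}

# Potentials found by hand for this toy game (not computed by pumping).
x = {"w": Fraction(0), "b": Fraction(14, 5), "r": Fraction(8, 5)}

canonical = transform(rewards, x)
m = local_values(kind, canonical, prob)
print(m)  # all local values coincide, here at 6/5
```

In the transformed game, White's locally best arc at `w` is `(w, b)` with transformed reward 6/5, which beats the self-loop (reward 1), and 6/5 is also the mean payoff of the Markov chain induced by that choice. This illustrates the sense in which a locally optimal move is optimal in the whole game once the rewards are in canonical form.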

[1]  Marcin Jurdziński, et al. A Discrete Strategy Improvement Algorithm for Solving Parity Games, 2000, CAV.

[2]  Peter Bro Miltersen, et al. On the computational complexity of solving stochastic mean-payoff games, 2008, ArXiv.

[3]  Ian Stark, et al. Free-Algebra Models for the pi-Calculus, 2005, FoSSaCS.

[4]  Henrik Björklund, et al. A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games, 2007, Discrete Applied Mathematics.

[5]  S. Lippman, et al. Stochastic Games with Perfect Information and Time Average Payoff, 1969.

[6]  S. Vajda, et al. Contribution to the Theory of Games, 1951.

[7]  Anne Condon, et al. The Complexity of Stochastic Games, 1992, Inf. Comput.

[8]  Vladimir Gurvich, et al. Every stochastic game with perfect information admits a canonical form, 2009.

[9]  H. Moulin. Prolongement des jeux à deux joueurs de somme nulle. Une théorie abstraite des duels, 1976.

[10]  Emmanuel Beffara, et al. Adapting Gurvich-Karzanov-Khachiyan's Algorithm for Parity Games: Implementation and Experimentation, 2001.

[11]  J. A. Bather. Markovian Decision Processes, 1971.

[12]  H. Moulin. Extensions of two person zero sum games, 1976.

[13]  Alexander V. Karzanov, et al. Cyclical games with prohibitions, 1993, Math. Program.

[14]  Nicolai N. Pisaruk, et al. Mean Cost Cyclical Games, 1999, Math. Oper. Res.

[15]  John G. Kemeny, et al. Finite Markov Chains, 1960.

[16]  Richard M. Karp, et al. A characterization of the minimum cycle mean in a digraph, 1978, Discret. Math.

[17]  Stéphane Gaubert, et al. How to solve large scale deterministic games with mean payoff by policy iteration, 2006, valuetools '06.

[18]  Marcin Jurdziński, et al. Deciding the Winner in Parity Games is in UP ∩ co-UP, 1998, Inf. Process. Lett.

[19]  Krishnendu Chatterjee, et al. Quantitative stochastic parity games, 2004, SODA '04.

[20]  Florian Horn, et al. Simple Stochastic Games with Few Random Vertices Are Easy to Solve, 2008, FoSSaCS.

[21]  Uri Zwick, et al. The Complexity of Mean Payoff Games on Graphs, 1996, Theor. Comput. Sci.

[22]  S. Vorobyov, et al. Is Randomized Gurvich-Karzanov-Khachiyan's Algorithm for Parity Games Polynomial?, 2001.

[23]  Vladimir Gurvich, et al. Why Chess and Backgammon can be solved in pure positional uniformly optimal strategies, 2009.

[24]  Vladimir Gurvich, et al. Nash-solvable bidirected cyclic two-person game forms, 2008.

[25]  A. Karzanov, et al. Cyclic games and an algorithm to find minimax cycle means in directed graphs, 1990.

[26]  Dean Gillette, et al. Stochastic Games with Zero Stop Probabilities, 1958.

[27]  Michael L. Littman, et al. Algorithms for Sequential Decision Making, 1996.

[28]  Henrik Björklund, et al. Combinatorial structure and randomized subexponential algorithms for infinite games, 2005, Theor. Comput. Sci.

[29]  A. Ehrenfeucht, et al. Positional strategies for mean payoff games, 1979.

[30]  Krishnendu Chatterjee, et al. Reduction of stochastic parity to stochastic mean-payoff games, 2008, Inf. Process. Lett.

[31]  R. Karp, et al. On Nonterminating Stochastic Games, 1966.

[32]  Henrik Björklund, et al. A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games, 2007, Discret. Appl. Math.

[33]  John G. Kemeny, et al. Finite Markov chains, 1960.

[34]  Oliver Friedmann, et al. An Exponential Lower Bound for the Parity Game Strategy Improvement Algorithm as We Know it, 2009, 2009 24th Annual IEEE Symposium on Logic In Computer Science.

[35]  M. Paterson, et al. A deterministic subexponential algorithm for solving parity games, 2006, SODA 2006.

[36]  T. Gallai, et al. Maximum-Minimum Sätze über Graphen, 1958.

[37]  Sergei G. Vorobyov, et al. Cyclic games and linear programming, 2008, Discret. Appl. Math.

[38]  Nir Halman, et al. Simple Stochastic Games, Parity Games, Mean Payoff Games and Discounted Payoff Games Are All LP-Type Problems, 2007, Algorithmica.

[39]  Kurt Mehlhorn, et al. Certifying algorithms for recognizing interval graphs and permutation graphs, 2003, SODA '03.

[40]  Peter Bro Miltersen, et al. The Complexity of Solving Stochastic Games on Graphs, 2009, ISAAC.