Strategy Improvement for Concurrent Safety Games

We consider concurrent games played on graphs. In every round of the game, each player simultaneously and independently selects a move; the moves jointly determine the transition to a successor state. Two basic objectives are the safety objective, "stay forever in a set F of states", and its dual, the reachability objective, "reach a set R of states". We present a strategy improvement algorithm for computing the value of a concurrent safety game, that is, the maximal probability with which player 1 can enforce the safety objective. The algorithm yields a sequence of player-1 strategies whose winning probabilities converge monotonically to the value of the safety game. The significance of the result is twofold. First, while strategy improvement algorithms were known for Markov decision processes and turn-based games, as well as for concurrent reachability games, this is the first strategy improvement algorithm for concurrent safety games. Second, and most importantly, the improvement algorithm provides a way to approximate the value of a concurrent safety game from below (the known value-iteration algorithms approximate the value from above). Thus, when used together with value-iteration algorithms, or with strategy improvement algorithms for reachability games, our algorithm leads to the first practical algorithm for computing converging upper and lower bounds for the value of reachability and safety games.
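To make "value" and "approximation from above" concrete, the following is a sketch in standard notation; the abstract itself fixes none of these symbols, so all of them are assumptions. Here S is the finite state set, \Gamma_i(s) the finite set of moves available to player i at state s, \delta(s,a,b)(t) the probability of moving from s to t under the move pair (a,b), \mathcal{D}(X) the set of probability distributions over X, and \pi_1, \pi_2 range over strategies of the two players.

```latex
% Value of the safety game at state s (notation assumed, not taken from the abstract)
\[
  v(s) \;=\; \sup_{\pi_1} \inf_{\pi_2} \Pr\nolimits_s^{\pi_1,\pi_2}\bigl(\mathrm{Safe}(F)\bigr)
\]

% Value iteration: u_k(s) is the value of staying in F for k more rounds
\[
  u_0(s) = \begin{cases} 1 & \text{if } s \in F \\ 0 & \text{otherwise} \end{cases}
  \qquad
  u_{k+1}(s) = \begin{cases}
    \displaystyle
    \sup_{\xi_1 \in \mathcal{D}(\Gamma_1(s))} \; \inf_{\xi_2 \in \mathcal{D}(\Gamma_2(s))}
      \sum_{a,b} \xi_1(a)\,\xi_2(b) \sum_{t \in S} \delta(s,a,b)(t)\, u_k(t)
      & \text{if } s \in F \\[1ex]
    0 & \text{otherwise}
  \end{cases}
\]

% The iterates decrease monotonically, so each u_k is an upper bound on v
\[
  u_0 \;\ge\; u_1 \;\ge\; u_2 \;\ge\; \cdots \;\ge\; v,
  \qquad \lim_{k \to \infty} u_k(s) = v(s)
\]
```

Under these assumptions every iterate u_k bounds the safety value from above, while the value guaranteed by any fixed player-1 strategy bounds it from below. By determinacy, the value for player 2 of reaching R = S \setminus F is 1 - v(s), so the same pair of bounds also brackets the reachability value.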
