Partial-Observation Stochastic Games: How to Win When Belief Fails

We consider two-player stochastic games played on finite graphs with reachability objectives, where the first player tries to ensure that a target state is visited almost surely (i.e., with probability 1) or positively (i.e., with positive probability), no matter the strategy of the second player. We classify such games according to the information and the power of randomization available to the players. On the basis of information, the game can be one-sided, with either (a) player 1 or (b) player 2 having partial observation (while the other player has perfect observation), or two-sided, with (c) both players having partial observation. On the basis of randomization, the players (a) may not be allowed to use randomization (pure strategies), or (b) may choose a probability distribution over actions, but the actual random choice is external and not visible to the player (actions invisible), or (c) may use full randomization. Our main results for pure strategies are as follows. (1) For one-sided games with player 1 having partial observation, we show that (in contrast to fully randomized strategies) belief-based (subset-construction-based) strategies are not sufficient, and we present an exponential upper bound on memory for both almost-sure and positive winning strategies; we show that the problem of deciding the existence of almost-sure and positive winning strategies for player 1 is EXPTIME-complete. (2) For one-sided games with player 2 having partial observation, we show that non-elementary memory is both necessary and sufficient for both almost-sure and positive winning strategies. (3) We show that for the general (two-sided) case, finite-memory strategies are sufficient for both positive and almost-sure winning, and at least non-elementary memory is required. We establish the equivalence of the almost-sure winning problems for pure strategies and for randomized strategies with actions invisible.
Our equivalence result exhibits serious flaws in previous results in the literature: we show a non-elementary memory lower bound for almost-sure winning, whereas an exponential upper bound was previously claimed.
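
To make the term "belief" concrete, the following is a minimal Python sketch of the standard subset-construction update for a player with partial observation. It is an illustration only, not the construction used in the paper. The names `belief_update`, `transitions`, and `obs_of`, as well as the toy game, are hypothetical. The sketch assumes that `transitions` maps a (state, player-1 action) pair to the set of states reachable with positive probability under any player-2 action.

```python
from typing import Dict, FrozenSet, Set, Tuple

State = str
Action = str
Obs = str


def belief_update(
    belief: FrozenSet[State],
    action: Action,
    observation: Obs,
    transitions: Dict[Tuple[State, Action], Set[State]],
    obs_of: Dict[State, Obs],
) -> FrozenSet[State]:
    """Return the set of states the game may be in after playing `action`
    from some state in `belief` and then receiving `observation`."""
    successors: Set[State] = set()
    for state in belief:
        # Collect every successor that is possible from this state.
        successors |= transitions.get((state, action), set())
    # Keep only the successors consistent with what the player observed.
    return frozenset(s for s in successors if obs_of[s] == observation)


# Hypothetical toy game: s1 and s2 look identical to player 1.
transitions = {
    ("s0", "a"): {"s1", "s2"},
    ("s1", "a"): {"goal"},
    ("s2", "a"): {"s2"},
}
obs_of = {"s0": "o0", "s1": "o1", "s2": "o1", "goal": "og"}

b0 = frozenset({"s0"})
b1 = belief_update(b0, "a", "o1", transitions, obs_of)
b2 = belief_update(b1, "a", "og", transitions, obs_of)
```

A belief-based strategy chooses its next action as a function of the current belief alone. Result (1) above states that, for pure strategies, this much memory is not always enough.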
