Different strokes in randomised strategies: Revisiting Kuhn's theorem under finite-memory assumptions

Two-player (antagonistic) games on (possibly stochastic) graphs are a prevalent model in theoretical computer science, notably as a framework for reactive synthesis. Optimal strategies may require randomisation when dealing with inherently probabilistic goals, balancing multiple objectives, or in contexts of partial information. There is no unique way to define randomised strategies. For instance, one can use so-called mixed strategies or behavioural ones. In the most general settings, these two classes do not share the same expressiveness. A seminal result in game theory, Kuhn’s theorem, asserts their equivalence in games of perfect recall. This result crucially relies on the possibility for strategies to use infinite memory, i.e., unlimited knowledge of all the past of a play. However, computer systems are finite in practice. Hence it is pertinent to restrict our attention to finite-memory strategies, defined as automata with outputs. Randomisation can be implemented in these in different ways: each of the initialisation, the outputs and the transitions can be either randomised or deterministic. Depending on which aspects are randomised, the expressiveness of the corresponding class of finite-memory strategies differs. In this work, we study two-player turn-based stochastic games and provide a complete taxonomy of the classes of finite-memory strategies obtained by varying which of the three aforementioned components are randomised. Our taxonomy holds in settings of both perfect and imperfect information.
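To make the three possible sources of randomisation concrete, here is a minimal Python sketch of a finite-memory strategy as an automaton with outputs. It is an illustration only, not the formal definition used in this work: the class name `FiniteMemoryStrategy`, the dictionary encoding of distributions, and the three-letter label returned by `kind` (one letter each for initialisation, outputs and transitions, `D` for deterministic and `R` for randomised) are all assumptions made for this example.

```python
import random
from dataclasses import dataclass


def sample(dist, rng):
    """Draw one value from a {value: probability} dictionary."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values], k=1)[0]


def is_deterministic(dist):
    """A distribution is deterministic if one value carries all the mass."""
    return any(abs(p - 1.0) < 1e-12 for p in dist.values())


@dataclass
class FiniteMemoryStrategy:
    """Hypothetical encoding of an automaton with outputs.

    init:   {memory: probability}
    output: {(memory, state): {action: probability}}
    update: {(memory, state, action): {memory: probability}}
    """
    init: dict
    output: dict
    update: dict

    def kind(self):
        """Label (illustrative naming): which components are randomised."""
        components = [[self.init], self.output.values(), self.update.values()]
        return "".join(
            "D" if all(is_deterministic(d) for d in comp) else "R"
            for comp in components
        )

    def play(self, states, rng):
        """Return the actions chosen along a given sequence of game states."""
        memory = sample(self.init, rng)
        actions = []
        for state in states:
            action = sample(self.output[(memory, state)], rng)
            memory = sample(self.update[(memory, state, action)], rng)
            actions.append(action)
        return actions


# Randomised initialisation only: a coin flip between two deterministic
# strategies, in the spirit of a mixed strategy.
mixed = FiniteMemoryStrategy(
    init={"m0": 0.5, "m1": 0.5},
    output={("m0", "s"): {"a": 1.0}, ("m1", "s"): {"b": 1.0}},
    update={("m0", "s", "a"): {"m0": 1.0}, ("m1", "s", "b"): {"m1": 1.0}},
)

# Randomised outputs only: a fresh coin flip at every step, in the spirit
# of a behavioural strategy.
behavioural = FiniteMemoryStrategy(
    init={"m0": 1.0},
    output={("m0", "s"): {"a": 0.5, "b": 0.5}},
    update={("m0", "s", "a"): {"m0": 1.0}, ("m0", "s", "b"): {"m0": 1.0}},
)

print(mixed.kind(), behavioural.kind())
```

In this sketch, `mixed` commits to a single action for the whole play once its initial memory state is drawn, whereas `behavioural` may switch actions from one step to the next. The two toy strategies therefore induce different distributions over plays, which is the kind of difference that a comparison of such classes has to account for.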
