Planning for Stochastic Games with Co-Safe Objectives

We consider planning problems for stochastic games with objectives specified in a branching-time logic, probabilistic computation tree logic (PCTL). This problem is known to be undecidable when strategies with perfect recall, i.e., history-dependent strategies, are considered. In this paper, we show that if objectives are restricted to co-safe properties, a fragment of PCTL that can express a wide range of properties arising in practice, including reachability, the problem becomes decidable, even for the class of general strategies. We also give an algorithm for robust stochastic planning, where a winning strategy must tolerate some perturbations of the probabilities in the model. Our result indicates that satisfiability of co-safe PCTL is decidable as well.
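To make the simplest case concrete, the following is a minimal sketch of a single reachability objective on a turn-based stochastic game, solved by standard value iteration. It is not the decision procedure of this paper and does not cover nested co-safe PCTL formulas or general strategies. The toy game, the state names, and the function `reach_values` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical encoding: state -> (owner, {action: {successor: probability}}).
# The "max" player tries to reach the goal, the "min" player tries to avoid it.
Game = Dict[str, Tuple[str, Dict[str, Dict[str, float]]]]


def reach_values(game: Game, goal: Set[str],
                 iterations: int = 1000, tol: float = 1e-12) -> Dict[str, float]:
    """Approximate the probability of eventually reaching `goal` from each
    state by value iteration, starting from 0 outside the goal."""
    values = {s: (1.0 if s in goal else 0.0) for s in game}
    for _ in range(iterations):
        delta = 0.0
        for s, (owner, actions) in game.items():
            if s in goal or not actions:
                continue  # goal states stay at 1, dead ends stay at 0
            expected = [sum(p * values[t] for t, p in dist.items())
                        for dist in actions.values()]
            new = max(expected) if owner == "max" else min(expected)
            delta = max(delta, abs(new - values[s]))
            values[s] = new
        if delta < tol:
            break
    return values


# Toy game (assumed for illustration only).
toy_game: Game = {
    "s0": ("max", {"a": {"goal": 0.5, "s1": 0.5},
                   "b": {"goal": 0.6, "sink": 0.4}}),
    "s1": ("min", {"c": {"goal": 0.8, "sink": 0.2},
                   "d": {"goal": 0.5, "sink": 0.5}}),
    "goal": ("max", {}),
    "sink": ("max", {}),
}

values = reach_values(toy_game, {"goal"})
# Check a threshold objective of the form P>=0.7 [F goal] at the initial state.
satisfied = values["s0"] >= 0.7
```

In this toy game the maximizing player can meet the threshold from `s0` by choosing action `a`, even though the opponent then picks the less favorable action `d` in `s1`.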
