Of Cores: A Partial-Exploration Framework for Markov Decision Processes

We introduce a framework for approximate analysis of Markov decision processes (MDPs) with bounded-, unbounded-, and infinite-horizon properties. The main idea is to identify a "core" of an MDP, i.e., a subsystem in which the system provably remains with high probability, and to avoid computation on the less relevant rest of the state space. Although we identify the core using simulations and statistical techniques, it allows for rigorous error bounds in the analysis. Consequently, we obtain efficient analysis algorithms based on partial exploration for various settings, including the challenging case of strongly connected systems.
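To make the notion of a core more concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a step-bounded reading of the idea: a candidate set of states qualifies if, under every choice of actions, the probability of leaving it within a given number of steps stays below a threshold. The MDP encoding, the function names `max_exit_probability` and `is_bounded_core`, and the toy model are all hypothetical and chosen only for illustration. The check uses plain value iteration on the candidate set instead of the simulation-based identification mentioned above.

```python
def max_exit_probability(mdp, core, start, horizon):
    """Largest probability, over all action choices, of leaving `core`
    within `horizon` steps when starting in `start`.

    `mdp` maps state -> action -> list of (probability, successor).
    States outside `core` are treated as "already left" (value 1.0).
    """
    # With zero steps left, a state inside the core has not left it.
    values = {s: 0.0 for s in core}
    for _ in range(horizon):
        # One Bellman step: pick the action that maximises the exit probability.
        values = {
            s: max(
                sum(p * values.get(t, 1.0) for p, t in successors)
                for successors in mdp[s].values()
            )
            for s in core
        }
    return values.get(start, 1.0)


def is_bounded_core(mdp, core, start, horizon, epsilon):
    """Hypothetical check: is the worst-case exit probability below epsilon?"""
    return max_exit_probability(mdp, core, start, horizon) < epsilon


# Toy model (an assumption for illustration): state 2 is rarely reached.
example_mdp = {
    0: {"a": [(0.99, 1), (0.01, 2)], "b": [(1.0, 0)]},
    1: {"a": [(1.0, 0)]},
    2: {"a": [(1.0, 2)]},
}
candidate_core = {0, 1}

if __name__ == "__main__":
    print(max_exit_probability(example_mdp, candidate_core, 0, 3))
    print(is_bounded_core(example_mdp, candidate_core, 0, 3, 0.05))
```

In this toy model, an analysis restricted to states 0 and 1 would ignore state 2 at the cost of an error bounded by the computed exit probability, which is the kind of trade-off the framework aims to make rigorous.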
