Continuous-Time Markov Decisions based on Partial Exploration

We provide a framework for speeding up algorithms for time-bounded reachability analysis of continuous-time Markov decision processes. The principle is to find a small but almost equivalent subsystem of the original system and analyse only that subsystem. Candidates for the subsystem are identified through simulations and iteratively enlarged until runs are captured by the subsystem with sufficiently high probability. The framework is thus dual to that of abstraction refinement. We instantiate the framework with several traditional algorithms and experimentally confirm orders-of-magnitude speed-ups in many cases.
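To make the idea concrete, the following is a minimal Python sketch of the partial-exploration loop described above, under assumed data structures: the names Ctmdp, simulate_run and partial_exploration, the uniform sampling policy, and the fixed coverage threshold are all illustrative assumptions, not the authors' implementation.

```python
import random
from dataclasses import dataclass

@dataclass
class Ctmdp:
    # successors[state][action] -> list of (probability, next_state) pairs
    successors: dict
    exit_rate: dict   # state -> exponential exit rate (hypothetical simplification)
    initial: int
    goal: set

def simulate_run(model, policy, time_bound):
    """Sample one time-bounded run and return the set of visited states."""
    state, elapsed, visited = model.initial, 0.0, {model.initial}
    while elapsed < time_bound and state not in model.goal:
        elapsed += random.expovariate(model.exit_rate[state])
        probs, targets = zip(*model.successors[state][policy(state)])
        state = random.choices(targets, weights=probs)[0]
        visited.add(state)
    return visited

def partial_exploration(model, time_bound, runs=1000, coverage=0.99):
    """Grow a subsystem until at least `coverage` of sampled runs stay inside it."""
    subsystem = {model.initial}
    uniform = lambda s: random.choice(list(model.successors[s]))  # sampling policy
    while True:
        samples = [simulate_run(model, uniform, time_bound) for _ in range(runs)]
        inside = sum(1 for visited in samples if visited <= subsystem)
        if inside / runs >= coverage:
            return subsystem          # analyse only this subsystem afterwards
        for visited in samples:       # enlarge with states seen outside
            subsystem |= visited
```

The returned subsystem would then be handed to a standard time-bounded reachability solver in place of the full model; the coverage threshold controls the trade-off between subsystem size and the error introduced by truncation.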
