Finite optimal control for time-bounded reachability in CTMDPs and continuous-time Markov games

We establish the existence of optimal scheduling strategies for time-bounded reachability in continuous-time Markov decision processes (CTMDPs), and of co-optimal strategies for continuous-time Markov games. Optimal control not only exists but also has a surprisingly simple structure: the optimal schedulers from our proofs are deterministic and timed positional, and the time bound can be divided into finitely many intervals, on each of which the optimal strategies are positional. That is, we demonstrate the existence of finite optimal control. Finally, we show that these pleasant properties of Markov decision processes extend to the more general class of continuous-time Markov games, and that they hold for both early and late schedulers.
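
To make the notion of finite optimal control concrete, the following is a minimal sketch of how a deterministic, timed positional scheduler with finitely many switching points could be stored and queried. The class name, field names, states, actions, and switching times are all hypothetical illustrations and are not taken from the paper; the sketch only shows the shape of such a scheduler, not how an optimal one is computed.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class PiecewisePositionalScheduler:
    """Hypothetical container for a scheduler that is positional on each interval.

    switch_points: strictly increasing times that split the time bound
                   into len(switch_points) + 1 intervals.
    decisions:     one positional map (state -> action) per interval.
    """
    switch_points: tuple
    decisions: tuple

    def action(self, state, t):
        # Find the interval containing time t, then apply its positional map.
        # A switching point itself belongs to the interval that starts there.
        interval = bisect_right(self.switch_points, t)
        return self.decisions[interval][state]


# Illustrative data only: two states, two actions, two switching points.
sched = PiecewisePositionalScheduler(
    switch_points=(0.4, 1.5),
    decisions=(
        {"s0": "a", "s1": "b"},  # used for t < 0.4
        {"s0": "b", "s1": "b"},  # used for 0.4 <= t < 1.5
        {"s0": "a", "s1": "a"},  # used for t >= 1.5
    ),
)
```

Because the decision depends only on the current state and on which of the finitely many intervals the current time falls into, the whole scheduler is a finite object, which is the sense in which the control is finite.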
