Exponential Convergence in Undiscounted Continuous-Time Markov Decision Chains

In this paper, we analyze the asymptotic behaviour of the value function v_t of an undiscounted continuous-time Markov decision chain. Both the state space and the action space are assumed to be finite. A new proof of the convergence of v_t - t g is presented, where g denotes the maximal expected average reward over an infinite time horizon. Moreover, this convergence is shown to be exponential.
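
The convergence claim can be written out as an explicit bound. The following is a minimal LaTeX sketch of one such formulation, assuming a finite state space indexed by i, a limit vector v^* and constants C, alpha > 0; these symbols are introduced here purely for illustration and are not taken from the abstract, and g is written as possibly state-dependent, g(i), as in the multichain case.

\documentclass{article}
\begin{document}
% Hedged sketch of the exponential convergence statement; not the paper's
% exact theorem. Assumed symbols (not in the abstract): v^* a limit vector,
% C and \alpha positive constants, g(i) the maximal expected average reward
% starting from state i.
\[
  \max_{i} \left| v_t(i) - t\, g(i) - v^*(i) \right| \;\le\; C\, e^{-\alpha t},
  \qquad t \ge 0 .
\]
\end{document}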
