Drift and monotonicity conditions for continuous-time controlled Markov chains with an average criterion

We give conditions for the existence of average optimal policies for continuous-time controlled Markov chains with a denumerable state space and Borel action sets. The transition rates are allowed to be unbounded, and the reward/cost rates may have neither upper nor lower bounds. In the spirit of the "drift and monotonicity" conditions for continuous-time Markov processes, we propose a new set of conditions on the controlled process's primitive data. Under these conditions, we prove the existence of optimal (deterministic) stationary policies within the class of randomized Markov policies using the extended generator approach, rather than the Kolmogorov forward equation used in the previous literature, and we also establish the convergence of a policy iteration method. Moreover, we use a controlled queueing system to show that all of our conditions are satisfied, whereas those in the previous literature fail to hold.
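To make the policy iteration method mentioned above concrete, the following is a minimal numerical sketch for a hypothetical controlled queueing system: a truncated M/M/1-type queue in which the controller chooses the service rate. All rates, costs, and the truncation level are illustrative assumptions, and the sketch works through uniformization on a finite state space rather than the extended generator approach developed in the paper.

```python
import numpy as np

# Hypothetical truncated controlled queue: states 0..N-1 (queue length),
# actions are service rates; all numerical values are illustrative.
N = 20
lam = 0.8                              # arrival rate
actions = [0.5, 1.0, 1.5]              # candidate service rates
effort = {0.5: 0.1, 1.0: 0.5, 1.5: 1.2}  # effort cost per unit time
Lam = lam + max(actions)               # uniformization constant

def P_r(mu):
    """Uniformized transition matrix and reward-rate vector for service rate mu."""
    P = np.zeros((N, N))
    r = np.zeros(N)
    for i in range(N):
        up = lam if i < N - 1 else 0.0   # arrival blocked at the truncation level
        down = mu if i > 0 else 0.0      # no service in the empty state
        P[i, min(i + 1, N - 1)] += up / Lam
        P[i, max(i - 1, 0)] += down / Lam
        P[i, i] += 1.0 - (up + down) / Lam
        r[i] = -(i + effort[mu])         # reward = -(holding cost + effort cost)
    return P, r

def evaluate(policy):
    """Solve the evaluation equations g + h = r + P h with h[0] = 0 (unichain case)."""
    P = np.zeros((N, N))
    r = np.zeros(N)
    for i in range(N):
        Pi, ri = P_r(policy[i])
        P[i] = Pi[i]
        r[i] = ri[i]
    A = np.eye(N) - P
    A[:, 0] = 1.0                        # repurpose the h[0] column for the gain g
    sol = np.linalg.solve(A, r)
    g, h = sol[0], sol.copy()
    h[0] = 0.0
    return g, h

# Policy iteration: evaluate, then improve greedily until the policy is stable.
policy = [actions[0]] * N
for _ in range(50):
    g, h = evaluate(policy)
    improved = [max(actions, key=lambda mu: P_r(mu)[1][i] + P_r(mu)[0][i] @ h)
                for i in range(N)]
    if improved == policy:
        break
    policy = improved
```

Because the uniformized chain has the same stationary distribution as the continuous-time chain under each stationary policy, the gain `g` computed here equals the long-run average reward rate, and the loop terminates once the policy is greedy with respect to its own relative value function `h`.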
