Value iteration in countable state average cost Markov decision processes with unbounded costs
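As context for the topic named in the title, a minimal sketch of relative value iteration for an average-cost MDP, run on a tiny finite example. The paper concerns countable state spaces with unbounded costs; this demo uses a made-up two-state, two-action model (all states, costs, and transition matrices below are illustrative assumptions, not taken from the paper).

```python
# Illustrative sketch: relative value iteration for an average-cost MDP.
# The two-state model below is hypothetical; it is not from the paper.
import numpy as np

def relative_value_iteration(P, c, ref=0, tol=1e-9, max_iter=10_000):
    """P[a] is the transition matrix under action a; c[a] the cost vector.
    Returns (g, h, policy): optimal average cost, relative values, policy."""
    n_actions = len(P)
    n_states = P[0].shape[0]
    h = np.zeros(n_states)
    for _ in range(max_iter):
        # One-step Bellman backup: Q[a, s] = c_a(s) + sum_s' P_a(s, s') h(s')
        Q = np.array([c[a] + P[a] @ h for a in range(n_actions)])
        h_new = Q.min(axis=0)
        g = h_new[ref]            # normalize at a fixed reference state
        h_new = h_new - g
        done = np.max(np.abs(h_new - h)) < tol
        h = h_new
        if done:
            break
    policy = Q.argmin(axis=0)
    return g, h, policy

# Two-state example: action 0 is free but risky, action 1 costs 0.5
# extra per step but steers the chain toward the cheap state 0.
P = [np.array([[0.5, 0.5], [0.5, 0.5]]),
     np.array([[0.9, 0.1], [0.9, 0.1]])]
c = [np.array([0.0, 2.0]), np.array([0.5, 2.5])]
g, h, policy = relative_value_iteration(P, c)
# Here the safe action 1 is optimal in both states, with average cost 0.7.
```

The normalization step (subtracting the value at a reference state each sweep) keeps the iterates bounded, which is the standard device for making value iteration meaningful in the average-cost setting studied by the paper.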
[1] L. Sennott. A new condition for the existence of optimal stationary policies in average cost Markov decision processes, 1986.
[2] L. I. Sennott et al. Average cost optimal stationary policies in infinite state Markov decision processes with unbounded costs. Operations Research, 1989.
[3] R. Bellman. Dynamic programming. Science, 1957.
[4] O. Hernández-Lerma. Adaptive Markov Control Processes, 1989.
[5] L. I. Sennott et al. Average cost semi-Markov decision processes and the control of queueing systems. Probability in the Engineering and Informational Sciences, 1989.
[6] L. Sennott. A new condition for the existence of optimum stationary policies in average cost Markov decision processes - unbounded cost case. 25th IEEE Conference on Decision and Control, 1986.
[7] P. Schweitzer et al. The asymptotic behaviour of the minimal total expected cost for the denumerable state Markov decision model (prepublication), 1975.
[8] R. Tweedie. Hitting times of Markov chains, with application to state-dependent queues. Bulletin of the Australian Mathematical Society, 1977.
[9] P. J. Schweitzer et al. The asymptotic behavior of undiscounted value iteration in Markov decision problems. Mathematics of Operations Research, 1977.
[10] A. G. Pakes et al. Some conditions for ergodicity and recurrence of Markov chains. Operations Research, 1969.
[11] D. White et al. Dynamic programming, Markov chains, and the method of successive approximations, 1963.