Paper information - Constrained Semi-Markov decision processes with average rewards

Constrained Semi-Markov decision processes with average rewards

This paper deals with constrained average reward Semi-Markov Decision Processes (SMDPs) with finite state and action sets. We consider two average reward criteria. The first criterion is time-average rewards, which equal the lower limits of the expected average rewards per unit time, as the horizon tends to infinity. The second criterion is ratio-average rewards, which equal the lower limits of the ratios of the expected total rewards during the first n steps to the expected total duration of these n steps as n → ∞. For both criteria, we prove the existence of optimal mixed stationary policies for constrained problems when the constraints are of the same nature as the objective functions. For unichain problems, we show the existence of randomized stationary policies which are optimal for both criteria. However, optimal mixed stationary policies may be different for each of these criteria even for unichain problems. We provide linear programming algorithms for the computation of optimal policies.
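
The two criteria described in the abstract can be written out as follows. This is only a sketch: the notation is assumed for illustration and is not taken from the paper. Here $x$ is the initial state, $\pi$ a policy, $R(t)$ the total reward accumulated up to time $t$, and $r(x_i, a_i)$ and $\tau(x_i, a_i)$ the expected reward and expected duration of step $i$.

```latex
% Assumed notation, not the paper's: W = time-average reward, U = ratio-average reward.
\[
  W(x,\pi) \;=\; \liminf_{t \to \infty} \frac{1}{t}\, \mathbb{E}_x^{\pi}\bigl[ R(t) \bigr],
\]
\[
  U(x,\pi) \;=\; \liminf_{n \to \infty}
  \frac{\mathbb{E}_x^{\pi} \sum_{i=0}^{n-1} r(x_i, a_i)}
       {\mathbb{E}_x^{\pi} \sum_{i=0}^{n-1} \tau(x_i, a_i)} .
\]
```

The first expression averages over elapsed time, while the second averages over decision steps. That difference is why the abstract treats them as separate criteria.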

Eugene A. Feinberg

[1] Eitan Altman,et al. Denumerable Constrained Markov Decision Problems and Finite Approximations , 1992 .

[2] B. Fox. (g, w)—Optima in Markov Renewal Programs , 1968 .

[3] Eitan Altman,et al. Sensitivity of constrained Markov decision processes , 1991, Ann. Oper. Res..

[4] F. Beutler,et al. Optimal policies for controlled markov chains with a constraint , 1985 .

[5] Keith W. Ross,et al. Randomized and Past-Dependent Policies for Markov Decision Processes with Multiple Constraints , 1989, Oper. Res..

[6] Arie Hordijk,et al. Dynamic programming and Markov potential theory , 1974 .

[7] A. Shwartz,et al. Adaptive control of constrained Markov chains , 1991 .

[8] M. K. Ghosh,et al. Discrete-time controlled Markov processes with average cost criterion: a survey , 1993 .

[9] Keith W. Ross,et al. Markov Decision Processes with Sample Path Constraints , 1989 .

[10] L. C. M. Kallenberg,et al. Linear programming and finite Markovian control problems , 1984 .

[11] E. Denardo,et al. Multichain Markov Renewal Programs , 1968 .