Constrained Average Cost Markov Decision Chains

A Markov decision chain with a denumerable state space incurs two types of cost, for example an operating cost and a holding cost. The objective is to minimize the expected average operating cost subject to a constraint on the expected average holding cost. We prove the existence of an optimal policy for the constrained problem that is a randomized stationary policy built from two stationary policies differing on at most one state. The examples treated are a packet communication system with a reject option and a single-server queue with service rate control.
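The constrained problem described above can be written out as a short sketch. The notation here is assumed for illustration and is not taken from the paper: $c$ and $d$ denote the operating and holding cost functions, $\alpha$ the constraint bound, $X_t$ and $A_t$ the state and action at time $t$, and $q$ the randomization probability. Taking the long-run average as a limit superior is likewise an assumption.

```latex
% Assumed notation: c = operating cost, d = holding cost, \alpha = constraint bound.
\[
  J_c(\pi, i) = \limsup_{n \to \infty} \frac{1}{n}\,
    \mathbb{E}_\pi\!\left[\, \sum_{t=0}^{n-1} c(X_t, A_t) \,\middle|\, X_0 = i \right],
  \qquad
  K_d(\pi, i) = \limsup_{n \to \infty} \frac{1}{n}\,
    \mathbb{E}_\pi\!\left[\, \sum_{t=0}^{n-1} d(X_t, A_t) \,\middle|\, X_0 = i \right].
\]
% Constrained problem: minimize the average operating cost
% subject to a bound on the average holding cost.
\[
  \min_{\pi} \; J_c(\pi, i)
  \quad \text{subject to} \quad K_d(\pi, i) \le \alpha .
\]
% One reading of the optimal policy's structure: stationary policies f and g
% that agree everywhere except possibly at a single state i^*.
\[
  f(i) = g(i) \ \text{for all } i \ne i^{*},
  \qquad
  \text{at } i^{*}:\ \text{use } f(i^{*}) \text{ with probability } q,\
  g(i^{*}) \text{ with probability } 1 - q .
\]
```

Under this reading, randomization is needed in at most one state, and the policy is deterministic everywhere else.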
