On the value function in constrained control of Markov chains

It is known that the value function of an unconstrained Markov decision process with finitely many states and actions is a piecewise rational function in the discount factor α, and that the value function can be expressed as a Laurent series expansion about α = 1 for α close enough to 1. We show in this paper that this property also holds for the value function of Markov decision processes with additional constraints. More precisely, we show by a constructive proof that there are numbers 0 = α0 < α1 < ... < αm−1 < αm = 1 such that for every j = 1, 2, ..., m − 1 either the problem is infeasible for every discount factor α in the open interval (αj−1, αj), or the value function is a rational function in α on the closed interval [αj−1, αj]. As a consequence, if the constrained problem is feasible in a neighborhood of α = 1, then the value function has a Laurent series expansion about α = 1. Our proof technique for the constrained case also provides a new proof for the unconstrained case.
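For orientation, the following is a minimal sketch of the structure asserted above, in our own notation rather than the paper's (the symbols Vα, pj, qj, yk and ρ are ours; the breakpoints αj are those of the abstract). The second display is the standard Laurent series in the interest rate ρ = (1 − α)/α, as used for unconstrained finite MDPs, assumed here to be available whenever the constrained problem is feasible for all α close enough to 1.

```latex
% Piecewise rational structure on each feasibility interval:
% p_j, q_j are polynomials with q_j nonvanishing on [\alpha_{j-1}, \alpha_j].
V_\alpha \;=\; \frac{p_j(\alpha)}{q_j(\alpha)},
\qquad \alpha \in [\alpha_{j-1}, \alpha_j].

% Laurent series expansion about \alpha = 1, written in the interest rate
% \rho = (1-\alpha)/\alpha and valid for \alpha close enough to 1,
% assuming the constrained problem is feasible in that neighborhood:
V_\alpha \;=\; \sum_{k=-1}^{\infty} \rho^{k}\, y_k,
\qquad \rho \;=\; \frac{1-\alpha}{\alpha}.
```

In the unconstrained theory the leading coefficient y−1 is the long-run average (gain) term of the expansion; the abstract's feasibility condition near α = 1 is what guarantees that an analogous expansion exists in the constrained case.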
