Conditions for the uniqueness of optimal policies of discounted Markov decision processes

Abstract. This paper presents three conditions, each of which guarantees the uniqueness of optimal policies of discounted Markov decision processes. The conditions impose hypotheses on the state space X, the action space A, the admissible action sets A(x), x ∈ X, the transition probability Q, and the cost function c. Two of the conditions rely mainly on convexity assumptions. The third needs no assumptions of this kind, but it requires certain stochastic order relations for Q and requires the cost function c to attain its minimum with respect to the actions at exactly one action. We illustrate the conditions with several examples, including discrete models, the linear regulator problem, and a model of an inventory control system.
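To make the notion of uniqueness concrete for the discrete case, the following is a minimal sketch, not the paper's method: it runs standard value iteration on a small finite discounted model and checks whether the discounted-cost optimality equation has exactly one minimizing action in every state. The function names, the tolerance values, and the two-state toy data are all assumptions made for illustration only. The check is numerical, so ties are detected only up to the chosen tolerance.

```python
from typing import List

Costs = List[List[float]]               # cost[x][a]: one-stage cost c(x, a)
Transitions = List[List[List[float]]]   # trans[x][a][y]: probability Q(y | x, a)


def q_values(cost: Costs, trans: Transitions, v: List[float],
             x: int, alpha: float) -> List[float]:
    """Return c(x, a) + alpha * sum_y Q(y | x, a) v(y) for each action a."""
    return [
        cost[x][a] + alpha * sum(p * v[y] for y, p in enumerate(trans[x][a]))
        for a in range(len(cost[x]))
    ]


def value_iteration(cost: Costs, trans: Transitions, alpha: float = 0.9,
                    tol: float = 1e-10, max_iter: int = 10_000) -> List[float]:
    """Approximate the optimal discounted cost by successive approximations."""
    v = [0.0] * len(cost)
    for _ in range(max_iter):
        new_v = [min(q_values(cost, trans, v, x, alpha)) for x in range(len(cost))]
        gap = max(abs(n - o) for n, o in zip(new_v, v))
        v = new_v
        if gap < tol:
            break
    return v


def minimizing_actions(cost: Costs, trans: Transitions, v: List[float],
                       alpha: float = 0.9, eps: float = 1e-6) -> List[List[int]]:
    """List, for each state, the actions attaining the minimum up to eps."""
    result = []
    for x in range(len(cost)):
        q = q_values(cost, trans, v, x, alpha)
        best = min(q)
        result.append([a for a, qa in enumerate(q) if qa - best <= eps])
    return result


def has_unique_minimizers(cost: Costs, trans: Transitions,
                          alpha: float = 0.9) -> bool:
    """True if every state has exactly one minimizing action (up to eps)."""
    v = value_iteration(cost, trans, alpha)
    return all(len(acts) == 1 for acts in minimizing_actions(cost, trans, v, alpha))


# Hypothetical two-state, two-action model (illustrative data only).
toy_cost = [[1.0, 2.0], [0.5, 3.0]]
toy_trans = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [1.0, 0.0]],  # state 1: action 0 stays, action 1 moves to state 0
]

# Hypothetical one-state model with two identical actions, so a tie is forced.
tie_cost = [[1.0, 1.0]]
tie_trans = [[[1.0], [1.0]]]

if __name__ == "__main__":
    print(has_unique_minimizers(toy_cost, toy_trans))
    print(has_unique_minimizers(tie_cost, tie_trans))
```

In the second toy model both actions are identical, so the minimizer cannot be unique; this is the kind of degeneracy that a uniqueness condition has to exclude.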
