Denumerable semi-Markov decision chains with small interest rates

In this paper we investigate denumerable-state semi-Markov decision chains with small interest rates. We consider average and Blackwell optimality, allowing for multiple closed sets and unbounded immediate rewards. Our analysis uses the existence of a Laurent series expansion for the total discounted rewards and the continuity of its terms; the assumptions are expressed in terms of a weighted supremum norm. Our method is based on an algebraic treatment of Laurent series: it constructs an appropriate linear space with a lexicographic ordering. Using two operators and a positivity property, we establish the existence of bounded solutions to the optimality equations. The theory is illustrated with an example of a K-dimensional queueing system. This paper builds strongly on the work of Denardo [11] and Dekker and Hordijk [7].
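The lexicographic ordering mentioned above can be illustrated with a minimal sketch. In sensitive-discount criteria, a policy's total discounted reward expands as a Laurent series in the interest rate, and policies are ranked by comparing their coefficient sequences term by term. The helper `lex_compare` below is hypothetical (the paper works with coefficients in a weighted-norm linear space; this sketch uses finite scalar sequences for illustration only):

```python
def lex_compare(u, w, tol=1e-12):
    """Compare two truncated Laurent coefficient sequences lexicographically.

    Returns +1 if u dominates w, -1 if w dominates u, and 0 if the
    sequences agree up to the truncation (within tolerance tol).
    """
    for a, b in zip(u, w):
        if a > b + tol:
            return 1   # first differing coefficient decides the ordering
        if a < b - tol:
            return -1
    return 0

# Example: the leading coefficient (the average reward, or "gain")
# decides first; later terms (bias, second-order) only break ties.
coeffs_A = [2.0, -5.0, 1.0]
coeffs_B = [1.5, 10.0, 7.0]
assert lex_compare(coeffs_A, coeffs_B) == 1
```

A policy with a strictly larger gain dominates regardless of its later coefficients; Blackwell optimality corresponds to dominance in all coefficients simultaneously.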

[1] J. Harrison, Discrete Dynamic Programming with Unbounded Rewards, 1972.

[2] M. Schäl, et al., Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal, 1975.

[3] H. Deppe, et al., On the Existence of Average Optimal Policies in Semiregenerative Decision Models, Math. Oper. Res., 1984.

[4] R. Weber, et al., Optimal control of service rates in networks of queues, Advances in Applied Probability, 1987.

[5] S. Ross, Non-discounted denumerable Markovian decision models, 1968.

[6] E. Denardo, Markov Renewal Programs with Small Interest Rates, 1971.

[7] J. Bather, Optimal decision procedures for finite Markov chains. Part I: Examples, Advances in Applied Probability, 1973.

[8] J. Wessels, et al., Markov decision processes with unbounded rewards, 1977.

[9] F. Spieksma, et al., The existence of sensitive optimal policies in two multi-dimensional queueing models, 1991.

[10] A. Bailly, et al., Science régionale - Walter Isard, Introduction to regional science, Englewood Cliffs (NJ), Prentice-Hall, 1975.

[11] W. S. Jewell, et al., Markov-Renewal Programming. I: Formulation, Finite Return Models, 1963.

[12] M. L. Puterman, et al., Contracting Markov Decision Processes (Mathematical Centre Tract 71), 1978.

[13] E. Çinlar, et al., Introduction to Stochastic Processes, 1974.

[14] D. Blackwell, Discrete Dynamic Programming, 1962.

[15] S. Stidham, et al., Monotonic and Insensitive Optimal Policies for Control of Queues with Undiscounted Costs, Oper. Res., 1989.

[16] G. Klimov, Time-Sharing Service Systems. I, 1975.

[17] J. A. E. E. van Nunen, Contracting Markov decision processes, 1976.

[18] A. Federgruen, et al., The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms, 1978.

[19] A. Hordijk, et al., Average, Sensitive and Blackwell Optimal Policies in Denumerable Markov Decision Chains with Unbounded Rewards, Math. Oper. Res., 1988.

[20] B. L. Miller, Finite state continuous time Markov decision processes with an infinite planning horizon, 1968.

[21] B. L. Miller, et al., Discrete Dynamic Programming with a Small Interest Rate, 1969.

[22] L. I. Sennott, Average Cost Optimal Stationary Policies in Infinite State Markov Decision Processes with Unbounded Costs, Oper. Res., 1989.

[23] W. Jewell, Markov-Renewal Programming. II: Infinite Return Models, Example, 1963.

[24] H. Zijm, The optimality equations in multichain denumerable state Markov decision processes with the average cost criterion: the bounded cost case, 1985.

[25] D. Blackwell, Discounted Dynamic Programming, 1965.

[26] A. Federgruen, et al., Denumerable state semi-Markov decision processes with unbounded costs, average cost criterion (preprint), 1979.

[27] R. Dekker, et al., Recurrence Conditions for Average and Blackwell Optimality in Denumerable State Markov Decision Chains, Math. Oper. Res., 1992.

[28] C. Derman, Denumerable state Markovian decision processes: average cost criterion, 1966.

[29] A. F. Veinott, Discrete Dynamic Programming with Sensitive Discount Optimality Criteria, 1969.

[30] J. B. Lasserre, Conditions for Existence of Average and Blackwell Optimal Stationary Policies in Denumerable Markov Decision Processes, 1988.

[31] M. Schäl, On the Second Optimality Equation for Semi-Markov Decision Models, Math. Oper. Res., 1992.

[32] P. J. Schweitzer, et al., Denumerable Undiscounted Semi-Markov Decision Processes with Unbounded Rewards, Math. Oper. Res., 1983.

[33] K. L. Chung, Markov Chains with Stationary Transition Probabilities, 1961.

[34] L. I. Sennott, Average Cost Semi-Markov Decision Processes and the Control of Queueing Systems, Probability in the Engineering and Informational Sciences, 1989.

[35] K. Sladký, Sensitive Optimality Criteria in Countable State Dynamic Programming, Math. Oper. Res., 1977.

[36] P. Schweitzer, Perturbation theory and Markovian decision processes, 1965.

[37] E. Mann, Optimality equations and sensitive optimality in bounded Markov decision processes, 1985.

[38] B. L. Miller, et al., An Optimality Condition for Discrete Dynamic Programming with no Discounting, 1968.

[39] J. S. de Cani, A Dynamic Programming Algorithm for Embedded Markov Chains when the Planning Horizon is at Infinity, 1964.