Transient policies in discrete dynamic programming: Linear programming including suboptimality tests and additional constraints

This paper investigates the computation of transient-optimal policies in discrete dynamic programming. The model is quite general: it may contain transient as well as nontransient policies, and the transition matrices are not necessarily substochastic. A functional equation for the so-called transient-value-vector is derived and the concept of superharmonicity is introduced. This concept provides the linear program to compute the transient-value-vector and a transient-optimal policy. We also discuss the elimination of suboptimal actions, the solution of problems with additional constraints, and the computation of an efficient policy for a multiple objective dynamic programming problem.
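To make the notion of superharmonicity concrete, the following is a minimal sketch under simplifying assumptions that are narrower than the paper's model: every policy is transient and the transition matrices are substochastic. In that standard setting a vector v is superharmonic if v(i) >= r(i, a) + sum_j p(i, a, j) v(j) for every state i and action a, and the value vector is the smallest such vector. A linear program would minimize a positively weighted sum of v subject to these inequalities. The sketch swaps in plain value iteration for the linear program so that it needs no solver. The toy data `MODEL` and all function names are hypothetical and do not come from the paper.

```python
# Hypothetical toy model: MODEL[state][action] = (reward, {next_state: probability}).
# Row sums below 1 mean the process stops with the remaining probability,
# so every policy in this example is transient.
MODEL = {
    0: {"a": (1.0, {1: 0.5}), "b": (2.0, {})},
    1: {"a": (4.0, {})},
}


def backup(model, v, state, action):
    """One-step return: immediate reward plus expected continuation value."""
    reward, probs = model[state][action]
    return reward + sum(p * v[j] for j, p in probs.items())


def is_superharmonic(model, v, tol=1e-9):
    """Check v(i) >= r(i, a) + sum_j p(i, a, j) v(j) for all pairs (i, a)."""
    return all(
        v[i] >= backup(model, v, i, a) - tol
        for i in model
        for a in model[i]
    )


def value_iteration(model, sweeps=50):
    """Approximate the value vector by repeated maximization (not an LP)."""
    v = {i: 0.0 for i in model}
    for _ in range(sweeps):
        v = {i: max(backup(model, v, i, a) for a in model[i]) for i in model}
    return v


def greedy_policy(model, v):
    """Pick, in each state, an action attaining the maximum one-step return."""
    return {i: max(model[i], key=lambda a: backup(model, v, i, a)) for i in model}


if __name__ == "__main__":
    v = value_iteration(MODEL)
    print(v, is_superharmonic(MODEL, v), greedy_policy(MODEL, v))
```

In this toy model a vector that is too small in state 0, such as {0: 2.5, 1: 4.0}, violates the inequality for action "a", while a larger vector such as {0: 5.0, 1: 4.0} satisfies all inequalities without being the smallest one.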
