Conditions for characterizing the structure of optimal strategies in infinite-horizon dynamic programs

The study of infinite-horizon nonstationary dynamic programs using the operator approach is continued. The point of view taken here differs slightly from that of others in that Denardo's local income function is not used as a starting point: infinite-horizon values are defined as limits of finite-horizon values as the horizon grows long. Two important conditions of an earlier paper are weakened, yet the optimality equations, the optimality criterion, and the existence of optimal “structured” strategies are still obtained.
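The central idea, that infinite-horizon optimal values arise as limits of finite-horizon optimal values, can be illustrated numerically. The sketch below is not from the paper: it uses a hypothetical two-state, two-action discounted Markov decision process and computes n-stage optimal values by backward induction, showing that successive horizon values become nearly identical as the horizon grows (a consequence of the discounted Bellman operator being a contraction).

```python
# Minimal numerical sketch (assumptions: a toy 2-state, 2-action MDP with
# discount factor 0.9; none of these numbers come from the paper).
import numpy as np

# rewards[s, a]: immediate reward in state s under action a
rewards = np.array([[1.0, 0.0],
                    [0.0, 2.0]])
# P[a, s, t]: probability of moving from state s to state t under action a
P = np.array([[[0.9, 0.1],
               [0.2, 0.8]],
              [[0.5, 0.5],
               [0.1, 0.9]]])
beta = 0.9  # discount factor

def n_stage_value(n):
    """Optimal n-horizon value via backward induction, starting from v_0 = 0."""
    v = np.zeros(2)
    for _ in range(n):
        # Bellman operator: for each (s, a), immediate reward plus the
        # discounted expected continuation value; then maximize over actions.
        q = rewards + beta * np.einsum('ast,t->sa', P, v)
        v = q.max(axis=1)
    return v

# Finite-horizon values converge as the horizon length grows: the gap
# between successive horizons shrinks geometrically at rate beta.
v200, v201 = n_stage_value(200), n_stage_value(201)
print(np.max(np.abs(v201 - v200)))  # tiny: the limit (infinite-horizon value) is near
```

In this discounted setting the limit exists and satisfies the optimality equations; the paper's contribution is obtaining analogous conclusions under weaker conditions than contraction alone.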
