Overtaking and Almost-Sure Optimality for Infinite Horizon Markov Decision Processes

We consider infinite horizon optimal control of Markov chains on complete metric spaces. We employ the overtaking optimality criterion, applied either to the expected cost flow or to the individual sample paths, the latter yielding almost-sure optimality results. We use the existence of a solution pair (Φ(·), λ) to the optimality equation (LΦ)(x) = λ to establish and characterize optimal strategies. For finite state spaces we derive sufficient as well as necessary conditions for overtaking optimality.
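
For context, a standard formulation of the objects referred to above, given here only as a sketch under the usual average-cost assumptions (a state space X, admissible action sets A(x), a one-stage cost c, and a transition kernel P, none of which are spelled out in the abstract; the paper's operator L and sign conventions may differ), reads

\[
  \lambda + \Phi(x) \;=\; \inf_{a \in A(x)} \Big\{ c(x,a) + \int_X \Phi(y)\, P(dy \mid x,a) \Big\}, \qquad x \in X,
\]

and a strategy \pi^* is overtaking optimal if, for every strategy \pi and every initial state x,

\[
  \limsup_{N \to \infty} \; \sum_{n=0}^{N} \Big( E_x^{\pi^*} c(x_n,a_n) \;-\; E_x^{\pi} c(x_n,a_n) \Big) \;\le\; 0 .
\]

The almost-sure variant mentioned in the abstract replaces the expected partial sums by the pathwise partial sums of costs along individual sample paths.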
