Rolling Horizon Procedures in Nonhomogeneous Markov Decision Processes

By far the most common planning procedure found in practice is to approximate the solution to an infinite horizon problem by a series of rolling finite horizon solutions. Although many empirical studies have been done, this so-called rolling horizon procedure has been the subject of few analytic studies. We provide a cost error bound for a general rolling horizon algorithm when applied to infinite horizon nonhomogeneous Markov decision processes, both in the discounted and average cost cases. We show that a Doeblin coefficient of ergodicity acts much like a discount factor to reduce this error. In particular, we show that the error goes to zero for any fixed rolling horizon as this Doeblin measure of control over the future decreases. The theory is illustrated through an application to vehicle deployment.
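To make the procedure concrete, the following is a minimal sketch of a rolling horizon algorithm for a discounted, nonhomogeneous Markov decision process with finitely many states and actions. It illustrates the general idea and is not the algorithm analyzed in the paper. All names are hypothetical: `P(t)` is assumed to return an array of shape (actions, states, states) of stage-t transition probabilities, `c(t)` an array of shape (states, actions) of stage-t costs, `N` is the rolling horizon, and `beta` the discount factor. The terminal cost of each finite horizon problem is assumed to be zero. The function `doeblin` uses one common form of the Doeblin coefficient (the sum of column minima of a transition matrix); the paper's exact definition may differ.

```python
import numpy as np

def first_action(P, c, t, s, N, beta):
    """Solve the N-stage problem over stages t..t+N-1 by backward
    induction and return the minimizing action for state s at stage t."""
    v = np.zeros(c(t).shape[0])            # assumed zero terminal cost
    for k in range(t + N - 1, t - 1, -1):
        # q[i, a] = stage cost + discounted expected cost-to-go
        q = c(k) + beta * (P(k) @ v).T
        v = q.min(axis=1)
    return int(q[s].argmin())

def rolling_horizon(P, c, s0, N, beta, T, seed=0):
    """Simulate T stages: at each stage, re-solve an N-stage problem,
    apply only its first decision, then roll forward one stage."""
    rng = np.random.default_rng(seed)
    s, total, actions = s0, 0.0, []
    for t in range(T):
        a = first_action(P, c, t, s, N, beta)
        total += beta**t * c(t)[s, a]
        actions.append(a)
        probs = P(t)[a, s]
        s = int(rng.choice(len(probs), p=probs))
    return total, actions

def doeblin(M):
    """Sum of column minima of a stochastic matrix (assumed definition):
    1 when all rows are identical, 0 for the identity matrix."""
    return float(np.asarray(M).min(axis=0).sum())
```

In a small two-state example where each action moves deterministically to one state, a horizon of N = 1 picks the myopically cheapest action, while a longer horizon can accept a higher immediate cost to reach a cheaper state. This is the kind of horizon-dependent error that the bound in the paper quantifies.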
