Error bounds for rolling horizon policies in discrete-time Markov control processes

Error bounds are presented for rolling horizon (RH) policies in general stationary and nonstationary (Borel) Markov control problems with both discounted and average reward criteria. In each case, conditions are given under which the reward of the rolling horizon policy converges geometrically to the optimal reward function, uniformly in the initial state, as the length of the rolling horizon increases. The control model and the general assumptions are described. The approach extends the results of J. M. Alden and R. L. Smith (1988) on nonstationary processes with finite state and action spaces, but the proofs presented here are simpler: when stationary models are analyzed first, the error bounds follow more or less directly from well-known value iteration results, and the corresponding bounds for nonstationary models are then obtained by reducing these models to stationary ones.
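For the discounted case, the flavor of these bounds can be sketched from standard value iteration estimates. The following display is an illustrative sketch only, assuming one-stage rewards bounded by \(\bar r\) and a discount factor \(\alpha \in (0,1)\); the constants come from the classical value iteration argument, not necessarily from the paper itself. Writing \(v_N\) for the \(N\)-stage optimal reward started from \(v_0 \equiv 0\), \(v^*\) for the optimal reward function, and \(\pi_N\) for the rolling horizon policy that at every stage applies the first action of an \(N\)-horizon optimal policy (so \(\pi_N\) is greedy with respect to \(v_{N-1}\)), the standard greedy-policy bound \(\|v^* - v_\pi\|_\infty \le \tfrac{2\alpha}{1-\alpha}\|v^* - u\|_\infty\) for a policy \(\pi\) greedy with respect to \(u\) gives

\[ 0 \;\le\; v^*(x) - v_{\pi_N}(x) \;\le\; \frac{2\alpha}{1-\alpha}\,\|v^* - v_{N-1}\|_\infty \;\le\; \frac{2\,\bar r\,\alpha^{N}}{(1-\alpha)^{2}} \qquad \text{for every initial state } x, \]

which exhibits the geometric convergence, uniform in the initial state, described above.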

[1] O. Hernández-Lerma et al., Controlled Markov Processes, 1965.

[2] K. Hinderer, Foundations of Non-stationary Dynamic Programming with Discrete Time Parameter, 1970.

[3] D. Kleinman, An easy way to stabilize a linear constant system, 1970.

[4] E. L. Porteus, Bounds and Transformations for Discounted Finite Markov Decision Chains, Oper. Res., 1975.

[5] K. Yamada, Duality theorem in Markovian decision problems, 1975.

[6] M. Schäl, Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal, 1975.

[7] T. Parthasarathy et al., Optimal Plans for Dynamic Programming Problems, Math. Oper. Res., 1976.

[8] G. Hübner, On the Fixed Points of the Optimal Reward Operator in Stochastic Dynamic Programming with Discount Factor Greater than One, 1976.

[9] K. Hinderer et al., An Improvement of J. F. Shapiro's Turnpike Theorem for the Horizon of Finite Stage Discrete Dynamic Programs, 1977.

[10] J. P. Georgin, Estimation et contrôle des chaînes de Markov sur des espaces arbitraires [Estimation and control of Markov chains on arbitrary spaces], 1978.

[11] K. R. Baker et al., An Analytic Framework for Evaluating Rolling Schedules, 1979.

[12] G. Hübner, Bounds and good policies in stationary finite-stage Markovian decision problems, Adv. Appl. Probab., 1980.

[13] L. Johansen et al., Lectures on macroeconomic planning, 1980.

[14] Stochastic Equilibrium and Optimality with Rolling Plans, 1981.

[15] T. Kailath et al., Stabilizing state-feedback design via the moving horizon method, Proc. 21st IEEE Conference on Decision and Control, 1982.

[16] R. Cavazos-Cadena, Finite-state approximations for denumerable state discounted Markov decision processes, 1986.

[17] P. L'Ecuyer et al., Approximation and bounds in discrete event dynamic programming, 1986.

[18] C.-Y. Lee et al., Rolling Planning Horizons: Error Bounds for the Dynamic Lot Size Model, Math. Oper. Res., 1986.

[19] D. P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models, 1987.

[20] A. Mokkadem, Sur un modèle autorégressif non linéaire, ergodicité et ergodicité géométrique [On a nonlinear autoregressive model: ergodicity and geometric ergodicity], 1987.

[21] M. Schäl, Estimation and control in discounted stochastic dynamic programming, 1987.

[22] H. Michalska, Receding horizon control of non-linear systems, 1988.

[23] O. Hernández-Lerma et al., A forecast horizon and a stopping rule for general Markov decision processes, 1988.

[24] O. Hernández-Lerma, Adaptive Markov Control Processes, 1989.

[25] O. Hernández-Lerma et al., Recurrence conditions for Markov decision processes with Borel state space: A survey, 1991.

[26] J. M. Alden and R. L. Smith, Rolling Horizon Procedures in Nonhomogeneous Markov Decision Processes, Oper. Res., 1992.