Policy Bounds for Markov Decision Processes

This paper demonstrates how a Markov decision process (MDP) can be approximated to generate a policy bound, i.e., a function that bounds the optimal policy from below or from above at every state. We present sufficient conditions under which several computationally attractive approximations yield rigorous policy bounds. These approximations include approximating the optimal value function, replacing the original MDP with a separable approximate MDP, and approximating a stochastic MDP with its deterministic counterpart. An example from the field of fisheries management demonstrates the practical applicability of the results.
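
As a rough numerical illustration of the third approximation, the sketch below solves a toy stochastic harvest MDP and its deterministic counterpart (shocks replaced by their mean) by value iteration on a discretized stock grid, so the two optimal policies can be compared state by state. Every model ingredient here (growth function, utility, grid, discount factor, shock distribution) is an illustrative assumption, not taken from the paper; whether the deterministic policy bounds the stochastic optimum from above or below depends on the model, and the paper's sufficient conditions characterize when such a bound is guaranteed. The code simply reports the comparison.

```python
# A minimal sketch, assuming a one-dimensional harvest model: stock x, harvest
# h <= x, escapement s = x - h, next stock z * g(s) with multiplicative shock z.
import numpy as np

beta = 0.95                            # discount factor (assumed)
grid = np.linspace(0.0, 2.0, 201)      # discretized stock levels (assumed)
shocks = np.array([0.8, 1.2])          # growth shocks (assumed), E[z] = 1
probs = np.array([0.5, 0.5])           # shock probabilities (assumed)

def growth(s):
    # Concave logistic-style growth of escapement (assumed functional form).
    return s + 0.5 * s * (1.0 - s / 2.0)

def value_iteration(shock_vals, shock_probs, tol=1e-6):
    """Solve max_h E[ u(h) + beta * V(z * g(x - h)) ] on the grid."""
    n = len(grid)
    V = np.zeros(n)
    policy = np.zeros(n)
    while True:
        V_new = np.empty(n)
        for i, x in enumerate(grid):
            feasible = grid[grid <= x + 1e-12]   # harvest cannot exceed stock
            esc = x - feasible                   # escapement after harvest
            # Expected continuation value over the shocks, by linear
            # interpolation; np.interp clamps states that leave the grid.
            cont = sum(p * np.interp(z * growth(esc), grid, V)
                       for z, p in zip(shock_vals, shock_probs))
            vals = np.sqrt(feasible) + beta * cont   # u(h) = sqrt(h) (assumed)
            j = np.argmax(vals)
            V_new[i], policy[i] = vals[j], feasible[j]
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, policy
        V = V_new

_, pol_stoch = value_iteration(shocks, probs)
# Deterministic counterpart: the single shock z = E[z] = 1 with probability 1.
_, pol_det = value_iteration(np.array([1.0]), np.array([1.0]))

# Compare the two optimal harvest policies state by state.
print("deterministic policy >= stochastic policy everywhere:",
      np.all(pol_det >= pol_stoch - 1e-9))
```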