Paper information - Adaptive aggregation methods for discounted dynamic programming

Adaptive aggregation methods for discounted dynamic programming

We propose a class of iterative aggregation algorithms for solving discounted dynamic programming problems. The idea is to interject aggregation iterations in the course of the usual successive approximation method. An important new feature that sets our method apart from earlier proposals is that the aggregate groups of states change adaptively from one aggregation iteration to the next, depending on the progress of the computation. This allows acceleration of convergence in difficult problems involving multiple ergodic classes for which methods using fixed groups of aggregate states are ineffective. No knowledge of special problem structure is utilized by the algorithms.
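The idea in the abstract can be illustrated with a minimal sketch for the simpler case of evaluating a fixed policy, that is, solving J = g + αPJ. Everything here is an assumption made for illustration rather than something taken from the abstract: the function names, grouping states by the size of their Bellman residual, running an aggregation step every `agg_every` iterations, and the safeguard that discards an aggregation step that does not shrink the residual.

```python
import numpy as np

def bellman_update(J, P, g, alpha):
    """One successive-approximation step for a fixed policy: T(J) = g + alpha * P J."""
    return g + alpha * P @ J

def residual_groups(r, m):
    """Assumed grouping rule: split the range of residuals into m equal intervals.
    Labels are renumbered 0..k-1 so that no group is empty."""
    lo, hi = r.min(), r.max()
    if hi - lo < 1e-15:
        return np.zeros(len(r), dtype=int)
    idx = np.minimum(np.floor((r - lo) / (hi - lo) * m).astype(int), m - 1)
    _, labels = np.unique(idx, return_inverse=True)
    return labels

def aggregation_step(J, P, g, alpha, m):
    """Add a group-wise constant correction W y to J, where y solves the small
    aggregate system (I - alpha * Q P W) y = Q r and r is the current residual."""
    n = len(J)
    r = bellman_update(J, P, g, alpha) - J
    labels = residual_groups(r, m)          # groups are recomputed on every call
    k = labels.max() + 1
    W = np.zeros((n, k))
    W[np.arange(n), labels] = 1.0           # membership matrix (state -> group)
    Q = W.T / W.sum(axis=0)[:, None]        # averages a vector over each group
    y = np.linalg.solve(np.eye(k) - alpha * Q @ P @ W, Q @ r)
    J_agg = J + W @ y
    r_agg = bellman_update(J_agg, P, g, alpha) - J_agg
    # Safeguard (an assumption of this sketch): keep the step only if it helps.
    return J_agg if np.abs(r_agg).max() < np.abs(r).max() else J

def solve(P, g, alpha, m=3, agg_every=5, tol=1e-10, max_iter=10000):
    """Successive approximation with an aggregation step every agg_every iterations."""
    g = np.asarray(g, dtype=float)
    J = np.zeros(len(g))
    for it in range(1, max_iter + 1):
        J_new = bellman_update(J, P, g, alpha)
        if np.abs(J_new - J).max() < tol:
            return J_new, it
        J = J_new
        if it % agg_every == 0:
            J = aggregation_step(J, P, g, alpha, m)
    return J, max_iter

# Hypothetical example: a chain with two ergodic classes, {0, 1} and {2, 3}.
P = np.array([[0.9, 0.1, 0.0, 0.0],
              [0.1, 0.9, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.0, 0.0, 0.5, 0.5]])
g = np.array([1.0, 2.0, 5.0, 6.0])
J_hat, iterations = solve(P, g, alpha=0.95)
print(J_hat, iterations)
```

Because the groups are rebuilt from the current residual each time `aggregation_step` runs, states in different ergodic classes can end up in different groups without the code being told about the chain's structure. A full discounted dynamic programming solver would also minimize over actions in `bellman_update`, which this sketch leaves out.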

D. Bertsekas | D. Castañón

[1] J. MacQueen. A Modified Dynamic Programming Method for Markovian Decision Problems, 1966.

[2] Harold J. Kushner, et al. Accelerated procedures for the solution of discrete Markov control problems, 1971.

[3] Evan L. Porteus. Some Bounds for Discounted Sequential Decision Processes, 1971.

[4] M. Puterman, et al. Modified Policy Iteration Algorithms for Discounted Markov Decision Problems, 1978.

[5] W. Miranker, et al. Acceleration by aggregation of successive approximation methods, 1982.

[6] Martin L. Puterman, et al. Action Elimination Procedures for Modified Policy Iteration Algorithms, 1982, Oper. Res.

[7] Roy Mendelssohn, et al. An Iterative Aggregation Procedure for Markov Decision Processes, 1982, Oper. Res.

[8] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.