RAPID CONVERGENCE TECHNIQUES FOR MARKOV DECISION PROCESSES

Large-scale Markov Decision Processes are normally solved by the policy iteration approach developed by Howard [1] and modified by White [3]. White's modification makes use of the method of successive approximations. Computational experience has shown that for many processes the rate of convergence of successive approximations is very slow. In this paper, techniques for speeding convergence are discussed, and numerical examples and computational experience showing the relative merits of the various approaches are presented.
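
As a point of reference, the following is a minimal sketch of the successive-approximation scheme (value iteration); a discounted formulation is used here for concreteness, and the two-state transition matrices P, rewards r, and discount factor beta are illustrative assumptions, not data from the paper.

    import numpy as np

    # Hypothetical 2-action, 2-state discounted MDP (illustrative only).
    # P[a] is the transition matrix under action a; r[a, i] is the
    # immediate reward for taking action a in state i.
    P = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.5, 0.5], [0.7, 0.3]]])
    r = np.array([[1.0, 2.0],
                  [0.5, 3.0]])
    beta = 0.95  # discount factor (assumed)

    def value_iteration(P, r, beta, tol=1e-8, max_iter=10_000):
        """Successive approximations:
        v_{n+1}(i) = max_a [ r(i, a) + beta * sum_j P(j | i, a) v_n(j) ]."""
        n_actions, n_states, _ = P.shape
        v = np.zeros(n_states)
        for n in range(max_iter):
            # q[a, i]: one-step return of action a in state i plus
            # the discounted value of the resulting state.
            q = r + beta * P @ v
            v_new = q.max(axis=0)
            if np.max(np.abs(v_new - v)) < tol:
                return v_new, n + 1
            v = v_new
        return v, max_iter

    v_star, sweeps = value_iteration(P, r, beta)
    print(f"approximate optimal values {v_star} after {sweeps} sweeps")

Because each sweep contracts the error only by the factor beta, the iteration slows down sharply as beta approaches one; this is the slow convergence that motivates the acceleration techniques discussed in the paper.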