Finite-state approximations for denumerable-state discounted Markov decision processes

A finite-state iterative scheme introduced by White [9] to approximate the optimal value function of denumerable-state Markov decision processes with bounded rewards is extended to the case of unbounded rewards. Convergence theorems are proved that, when specialized to the case of bounded rewards, yield stronger results than those in [9]. Moreover, bounds on the rates of convergence are given under several sets of assumptions, and the extended scheme is used to obtain policies with asymptotic optimality properties.
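To fix ideas, the following is a minimal sketch of a truncation scheme of this type; all notation below ($S$, $A(i)$, $r$, $p$, $\beta$, $S_N$) is introduced here for illustration only, and the precise scheme of [9] and its extension may differ in details, for instance in how transition probability leaving the truncated set is handled. For a discounted MDP with denumerable state space $S = \{0, 1, 2, \dots\}$, admissible action sets $A(i)$, one-step rewards $r(i,a)$, transition law $p(j \mid i, a)$, and discount factor $\beta \in (0,1)$, the optimal value function $V^*$ satisfies the Bellman equation
\[
V^*(i) = \sup_{a \in A(i)} \Big[ r(i,a) + \beta \sum_{j \in S} p(j \mid i,a)\, V^*(j) \Big], \qquad i \in S.
\]
A finite-state approximation replaces $S$ by the truncated sets $S_N = \{0, 1, \dots, N\}$ and iterates, starting from $V_N^0 \equiv 0$,
\[
V_N^{n+1}(i) = \sup_{a \in A(i)} \Big[ r(i,a) + \beta \sum_{j \in S_N} p(j \mid i,a)\, V_N^n(j) \Big], \qquad i \in S_N,
\]
so that each iteration involves only finitely many states; convergence theorems of the kind described above then relate the resulting approximations to $V^*$ as $N \to \infty$.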