Approximations of Dynamic Programs, II

A general procedure is presented for constructing and analyzing approximations of dynamic programming models. The models considered are the monotone contraction operator models of Denardo (1967), which include Markov decision processes and stochastic games with a criterion of discounted present value over an infinite horizon, as well as many finite-stage dynamic programs. The approximations are typically achieved by replacing the original state and action spaces by subsets. Tight bounds are obtained for the distances between the optimal return function in the original model and (1) the extension of the optimal return function in the approximate model, and (2) the return function associated with the extension of an optimal policy in the approximate model. Conditions are also given under which the sequence of bounds associated with a sequence of approximating models converges to zero.
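To indicate the flavor of such bounds (a sketch under assumed notation, not the paper's exact statement: the symbols $T$, $\hat{T}$, $\beta$, $v^*$, $\hat{v}^*$, and $\epsilon$ are introduced here only for illustration), suppose the optimal-return operator $T$ of the original model is a contraction with modulus $\beta < 1$ and fixed point $v^*$, and the extended operator $\hat{T}$ of the approximate model has fixed point $\hat{v}^*$ and satisfies $\|Tv^* - \hat{T}v^*\| \le \epsilon$. The standard contraction argument then yields
\[
\|v^* - \hat{v}^*\| \;\le\; \|Tv^* - \hat{T}v^*\| + \|\hat{T}v^* - \hat{T}\hat{v}^*\| \;\le\; \epsilon + \beta\,\|v^* - \hat{v}^*\|,
\]
so that $\|v^* - \hat{v}^*\| \le \epsilon/(1-\beta)$. Bounds of this general form vanish when the approximating subsets are refined so that $\epsilon \to 0$.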