Stable LInear Approximations to Dynamic Programming for Stochastic Control Problems with Local Transitions
暂无分享,去创建一个
We consider the solution to large stochastic control problems by means of methods that rely on compact representations and a variant of the value iteration algorithm to compute approximate cost-togo functions. While such methods are known to be unstable in general, we identify a new class of problems for which convergence, as well as graceful error bounds, are guaranteed. This class involves linear parameterizations of the cost-to-go function together with an assumption that the dynamic programming operator is a contraction with respect to the Euclidean norm when applied to functions in the parameterized class. We provide a special case where this assumption is satisfied, which relies on the locality of transitions in a state space. Other cases will be discussed in a full length version of this paper.
[1] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[2] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[3] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[4] Krzysztof J. Cios,et al. Advances in neural information processing systems 7 , 1997 .