Finite-Element Methods with Local Triangulation Refinement for Continuous Reinforcement Learning Problems

This paper presents a reinforcement learning algorithm designed for solving optimal control problems in which the state and time are continuous variables. Like Dynamic Programming methods, reinforcement learning techniques generate an optimal feedback policy by means of the value function, which estimates the best expected cumulative reward as a function of the initial state. The algorithm proposed here uses finite-element methods to approximate this function. It is composed of two dynamics: the learning dynamics, called Finite-Element Reinforcement Learning, which estimates the values at the vertices of a triangulation defined over the state space; and the structural dynamics, which refines the triangulation in regions where the value function is irregular. This mesh-refinement algorithm is intended to address the combinatorial explosion of the number of values to be estimated. A formalism for reinforcement learning in the continuous case is proposed, the Hamilton-Jacobi-Bellman equation is stated, and the algorithm is then presented and applied to a simple two-dimensional target problem.