A Convergent Reinforcement Learning Algorithm in the Continuous Case: The Finite-Element Reinforcement Learning

CEMAGREF, LISC, Parc de Tourvoie, B.P. 121, 92185 Antony Cedex, France. Tel: (33-1) 40 96 61 21, ext. 64 14. Fax: (33-1) 40 96 60 80.

... the time are continuous, and we propose a reinforcement learning algorithm, called Finite-Element Reinforcement Learning (FERL), that converges to the optimal solution. We give a formalism based on the continuous optimal control framework and state the Hamilton-Jacobi-Bellman (HJB) equation. We then point out some difficulties that arise in solving the HJB equation and give references on viscosity solutions.
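To make the difficulty with the HJB equation concrete, the following is a minimal sketch of a toy problem. It is not the FERL algorithm, which is not described here, but a plain model-based value iteration on a grid. Everything in it is an assumption chosen for illustration: the one-dimensional dynamics dx/dt = u with u in {-1, +1} on [0, 1], a discount factor gamma per unit of time, terminal rewards collected at the two boundaries, and the function names. For this toy problem the value function is the maximum of two smooth branches, so it has a kink where the branches cross. The HJB equation cannot hold in the classical sense at such a point, which is the kind of situation viscosity solutions are meant to handle.

```python
def solve_target_problem(n_cells=20, gamma=0.1, reward_left=1.0,
                         reward_right=2.0, tol=1e-12):
    """Grid value iteration for the assumed toy problem dx/dt = u, u in {-1, +1}.

    The state moves at unit speed, so crossing one cell of width h takes
    time h and multiplies future reward by gamma ** h.
    """
    h = 1.0 / n_cells
    step_discount = gamma ** h
    values = [0.0] * (n_cells + 1)
    values[0] = reward_left     # terminal reward when exiting at x = 0
    values[-1] = reward_right   # terminal reward when exiting at x = 1
    while True:
        change = 0.0
        for i in range(1, n_cells):
            # Best of moving one cell left or one cell right.
            best = step_discount * max(values[i - 1], values[i + 1])
            change = max(change, abs(best - values[i]))
            values[i] = best
        if change < tol:
            return values


def exact_value(x, gamma=0.1, reward_left=1.0, reward_right=2.0):
    """Closed-form value of the toy problem: go straight to the better exit."""
    return max(gamma ** x * reward_left, gamma ** (1.0 - x) * reward_right)


if __name__ == "__main__":
    n = 20
    grid_values = solve_target_problem(n_cells=n)
    for i, v in enumerate(grid_values):
        print(f"x = {i / n:.2f}   grid = {v:.6f}   exact = {exact_value(i / n):.6f}")
```

With these assumed defaults the two branches cross strictly inside the interval, so the printed values decrease and then increase, with a non-differentiable minimum in between. Because the controls move the state exactly one cell per step, the grid solution matches the closed form at the grid points; this is a property of the toy setup and should not be expected in general.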
