Paper information - A Convergent Reinforcement Learning Algorithm in the Continuous Case: The Finite-Element Reinforcement Learning


CEMAGREF, LISC, Parc de Tourvoie, B.P. 121, 92185 Antony Cedex, France. Tel: (33-1) 40 96 61 21, ext. 64 14. Fax: (33-1) 40 96 60 80.

... the time are continuous, and we propose a reinforcement learning algorithm, called Finite-Element Reinforcement Learning (FERL), that converges to the optimal solution. An adequate formalism based on the continuous optimal control framework is given, and the Hamilton-Jacobi-Bellman (HJB) equation is stated. Some problems that arise in solving the HJB equation are then underlined, and references to viscosity solutions are indicated.

Rémi Munos

[1] P. Lions, et al. Viscosity solutions of Hamilton-Jacobi equations, 1983.

[2] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[3] P. Souganidis. Approximation schemes for viscosity solutions of Hamilton-Jacobi equations, 1985.

[4] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.

[5] G. Barles, et al. Comparison principle for Dirichlet-type Hamilton-Jacobi equations and singular perturbations of degenerated elliptic equations, 1990.

[6] G. Barles, et al. Convergence of approximation schemes for fully nonlinear second order equations, 1990, 29th IEEE Conference on Decision and Control.

[7] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.

[8] Vijaykumar Gullapalli, et al. Reinforcement learning and its application to control, 1992.

[9] Andrew W. Moore, et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces, 2004, Machine Learning.

[10] Eduardo D. Sontag, et al. Neural Networks for Control, 1993.

[11] G. Barles. Solutions de viscosité des équations de Hamilton-Jacobi, 1994.

[12] M. James. Controlled Markov processes and viscosity solutions, 1994.

[13] A. Harry Klopf, et al. Reinforcement Learning Applied to a Differential Game, 1995, Adapt. Behav.

[14] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.