Function approximation for the deterministic Hamilton-Jacobi-Bellman equation

A new method based on Gaussian basis functions is proposed for solving the Hamilton-Jacobi-Bellman equation of deterministic continuous-time, continuous-valued optimal control problems. A semi-Lagrangian discretization scheme is used to obtain a discrete-time finite-state approximation of the continuous dynamics. The value function of the discretized system is approximated by a Gaussian network. An analysis of the limit behavior provides a proof of convergence for the scheme. The performance of the approach is demonstrated on an underpowered inverted pendulum as a numerical example. Furthermore, a comparison with approximation by continuous piecewise affine functions (the current state of the art) shows the benefits of the proposed technique.
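
The combination of a semi-Lagrangian time step with a Gaussian value-function approximation can be illustrated with a minimal sketch. It is not the scheme analyzed in the paper: the one-dimensional system, the running cost, the discount rate, the step size h, and all function names are hypothetical choices made for illustration, and the Gaussians are normalized so that they sum to one (an assumption that keeps the fixed-point iteration a contraction in this toy setting).

```python
import math

def gaussian_weights(x, centers, width):
    """Normalized Gaussian basis values at x (non-negative, summing to one)."""
    raw = [math.exp(-((x - c) / width) ** 2) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def evaluate(x, centers, coeffs, width):
    """Approximate value at x as a weighted combination of the coefficients."""
    weights = gaussian_weights(x, centers, width)
    return sum(w * v for w, v in zip(weights, coeffs))

def dynamics(x, u):
    """Hypothetical toy system: dx/dt = u."""
    return u

def running_cost(x, u):
    """Hypothetical running cost penalizing state and control effort."""
    return x * x + 0.1 * u * u

def value_iteration(centers, width, controls, h=0.1, discount_rate=1.0,
                    tol=1e-8, max_iter=10_000):
    """Iterate V(x) = min_u [h*cost(x,u) + exp(-rate*h) * V(x + h*f(x,u))]
    at the Gaussian centers until the coefficients stop changing."""
    beta = math.exp(-discount_rate * h)
    lo, hi = centers[0], centers[-1]
    # Precompute, for every (center, control) pair, the stage cost and the
    # basis weights at the semi-Lagrangian successor state (clamped to the domain).
    table = []
    for x in centers:
        row = []
        for u in controls:
            x_next = min(max(x + h * dynamics(x, u), lo), hi)
            row.append((h * running_cost(x, u),
                        gaussian_weights(x_next, centers, width)))
        table.append(row)
    coeffs = [0.0] * len(centers)
    for _ in range(max_iter):
        updated = [
            min(cost + beta * sum(w * v for w, v in zip(weights, coeffs))
                for cost, weights in row)
            for row in table
        ]
        change = max(abs(a - b) for a, b in zip(updated, coeffs))
        coeffs = updated
        if change < tol:
            break
    return coeffs

centers = [i / 10 for i in range(-10, 11)]   # 21 centers on [-1, 1]
width = 0.1                                  # assumed Gaussian width
controls = [-1.0, 0.0, 1.0]                  # assumed finite control set
coeffs = value_iteration(centers, width, controls)
```

After the iteration has converged, `evaluate(x, centers, coeffs, width)` returns the approximate value at any state x in the domain. Because the normalized weights form a convex combination, each update is a contraction with factor exp(-discount_rate * h); an unnormalized Gaussian network would need a separate argument.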
