Nearly optimal HJB solution for constrained input systems using a neural network least-squares approach

We consider the use of nonlinear networks to obtain nearly optimal solutions to constrained control problems. The method is based on a least-squares successive-approximation solution of the generalized Hamilton-Jacobi-Bellman (GHJB) equation, which appears in optimization problems. Successive approximation using the GHJB equation has not previously been applied to bounded controls. The proposed method successively solves the GHJB equation on a well-defined region of attraction, using a suitable nonquadratic functional that allows us to work with smooth bounded controls. A neural network is used to approximate the GHJB solution. The result is shown to be a closed-loop control based on a neural network that has been tuned a priori off-line. The control law structure is shown to have the largest possible region of asymptotic stability. As the order of the network is increased, and as the algorithm is run on more points in the well-defined region of attraction, the network is shown to converge to the solution of the inherently nonlinear HJB equation associated with the bounded control.
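The successive-approximation loop described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a scalar system dx/dt = a x + b u with |u| < 1, state penalty x^2, a tanh saturation with the matching nonquadratic control penalty W(u) = 2 * integral from 0 to u of atanh(v) dv, an even polynomial basis standing in for the neural network, and the hypothetical function names. Each pass solves the GHJB equation for the current control by least squares on sample points, then updates the control from the gradient of the fitted value function.

```python
import numpy as np

def basis(x):
    # Assumed approximator: V(x) ~ w1*x^2 + w2*x^4 + w3*x^6 (stands in for the network)
    return np.stack([x**2, x**4, x**6], axis=1)

def basis_grad(x):
    # Derivative of each basis function with respect to x
    return np.stack([2 * x, 4 * x**3, 6 * x**5], axis=1)

def nonquadratic_cost(u):
    # W(u) = 2 * int_0^u atanh(v) dv = 2*u*atanh(u) + ln(1 - u^2), for unit bound and R = 1
    return 2 * u * np.arctanh(u) + np.log1p(-u**2)

def solve_ghjb(a=-1.0, b=1.0, iters=20, n_points=201):
    # Sample points on an assumed region of interest [-2, 2]
    x = np.linspace(-2.0, 2.0, n_points)
    # Initial admissible control: u = 0 stabilizes the system because a < 0 (assumption)
    u = np.zeros_like(x)
    dsig = basis_grad(x)
    for _ in range(iters):
        # GHJB for fixed u: V'(x) * (a x + b u) + x^2 + W(u) = 0, which is linear in w
        xdot = a * x + b * u
        A = dsig * xdot[:, None]
        rhs = -(x**2 + nonquadratic_cost(u))
        w, *_ = np.linalg.lstsq(A, rhs, rcond=None)
        # Control update: smooth and bounded by construction
        u = -np.tanh(0.5 * b * (dsig @ w))
        # Guard against tanh rounding to exactly +/-1, where atanh is infinite
        u = np.clip(u, -1 + 1e-9, 1 - 1e-9)
    return w, x, u
```

Calling `w, x, u = solve_ghjb()` returns the fitted weights, the sample points, and the bounded control evaluated at those points. The least-squares step here runs over a fixed grid, so refining `n_points` or enlarging the basis is the analogue, in this toy setting, of increasing the number of points and the network order.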
