Paper information - Two coupled neural-networks-based solution of the Hamilton-Jacobi-Bellman equation

Abstract: This work investigates the design of optimal neural feedback control for discrete-time nonlinear systems. The basic idea is to use two coupled neural networks to approximate the solution of the Hamilton-Jacobi-Bellman (HJB) equation and to obtain a robust closed-loop feedback control law. The learning algorithm is a modified version of backpropagation. As an illustration, a nonlinear discrete-time numerical example is considered. Simulation results show the effectiveness of the proposed method.
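
For orientation, the discrete-time HJB equation that such networks would approximate can be sketched as follows. This rests on assumptions not stated in the abstract: an input-affine system, an infinite-horizon cost with state penalty Q(x), and a symmetric positive-definite input weight R. The symbols f, g, Q, R, and V^* are illustrative and are not taken from the paper.

```latex
\begin{align}
V^*(x_k) &= \min_{u_k}\bigl[\, Q(x_k) + u_k^{\top} R\, u_k + V^*(x_{k+1}) \,\bigr],
\qquad x_{k+1} = f(x_k) + g(x_k)\,u_k, \\
u^*(x_k) &= -\tfrac{1}{2}\, R^{-1} g(x_k)^{\top}\,
\frac{\partial V^*(x_{k+1})}{\partial x_{k+1}} .
\end{align}
```

The second line follows from setting the gradient of the bracketed term with respect to u_k to zero; it is implicit, because x_{k+1} itself depends on u^*. In a two-network scheme of this kind, one network would typically approximate V^* and the other the control law u^*, each trained against the other's current output. Whether the paper assigns the networks in exactly this way is not stated in the abstract.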
