Paper information - A neural-network-based iterative GDHP approach for solving a class of nonlinear optimal control problems with control constraints

This paper proposes a novel neural-network-based iterative adaptive dynamic programming (ADP) algorithm for solving the optimal control problem of a class of nonlinear discrete-time systems with control constraints. By introducing a generalized nonquadratic functional, the iterative ADP algorithm is developed using the globalized dual heuristic programming (GDHP) technique to design the optimal controller, and its convergence is analyzed. Three neural networks are constructed as parametric structures to facilitate the implementation of the iterative algorithm. At each iteration they approximate the cost function, the optimal control law, and the controlled nonlinear discrete-time system, respectively. A simulation example is provided to verify the effectiveness of the control scheme in solving the constrained optimal control problem.
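The abstract does not state the form of the generalized nonquadratic functional. As a rough illustration only, the sketch below assumes a form that is common in the constrained optimal control literature for a scalar control bounded by `u_bar`: W(u) = 2 r u_bar ∫₀ᵘ atanh(s / u_bar) ds. The names `nonquadratic_cost`, `nonquadratic_cost_numeric`, `u_bar`, and `r` are hypothetical and are not taken from the paper. The cost behaves like `r * u**2` for small controls and grows steeply as `|u|` approaches the bound, which is how such a penalty discourages constraint violation.

```python
import math


def nonquadratic_cost(u, u_bar=1.0, r=1.0):
    """Closed form of W(u) = 2*r*u_bar * integral_0^u atanh(s/u_bar) ds.

    Assumed illustrative form (not taken from the paper); valid for |u| < u_bar.
    """
    if abs(u) >= u_bar:
        raise ValueError("control must lie strictly inside the bound")
    z = u / u_bar
    # Antiderivative of atanh(s/u_bar) is s*atanh(s/u_bar) + (u_bar/2)*ln(1 - (s/u_bar)^2).
    return 2.0 * r * u_bar * (u * math.atanh(z) + 0.5 * u_bar * math.log1p(-z * z))


def nonquadratic_cost_numeric(u, u_bar=1.0, r=1.0, steps=10000):
    """Midpoint-rule evaluation of the same integral, used as a sanity check."""
    h = u / steps
    total = sum(math.atanh((i + 0.5) * h / u_bar) for i in range(steps))
    return 2.0 * r * u_bar * total * h


if __name__ == "__main__":
    for u in (0.0, 0.1, 0.5, 0.9, 0.99):
        print(u, nonquadratic_cost(u), r"quadratic:", u * u)
```

Running the script prints the assumed nonquadratic cost next to the plain quadratic cost `u**2`, so the two can be compared as the control approaches the bound.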

Derong Liu | Dongbin Zhao | Yuzhu Huang | Ding Wang | Dehua Zhang
