Adaptive optimal tracking control for nonlinear continuous-time systems with time delay using value iteration algorithm

Abstract: This paper develops an integral reinforcement learning (IRL)-based value iteration algorithm for the infinite-horizon optimal tracking control problem of nonlinear continuous-time systems with time delay. The main idea is to use value iteration to obtain an iterative control law that optimizes the iterative performance index function. First, in contrast to existing value iteration algorithms, the proposed IRL-based value iteration algorithm takes the time delay into account. Second, a convergence analysis of the proposed algorithm is given for nonlinear continuous-time systems with time delay. Moreover, a critic neural network is used to approximate the performance index function and to compute the optimal control law, which facilitates the implementation of the iterative algorithm. Finally, simulation results are presented to illustrate the effectiveness of the developed method.
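
To make the value iteration idea concrete, the following is a minimal sketch of an IRL-style value iteration on a deliberately simplified problem. It is not the algorithm of this paper: it assumes a scalar linear system dx/dt = a*x + b*u without time delay, a regulation (not tracking) objective with running cost q*x^2 + r*u^2, and a quadratic value function V_i(x) = p_i*x^2 in place of a critic neural network. The function names, the interval length T, and all numerical values are illustrative assumptions.

```python
import math


def irl_value_iteration(a, b, q, r, T, iterations):
    """IRL-style value iteration for dx/dt = a*x + b*u (hypothetical toy setup).

    Value update:  V_{i+1}(x(t)) = integral over [t, t+T] of (q*x^2 + r*u_i^2)
                                   + V_i(x(t+T)),  with V_i(x) = p_i * x^2.
    Policy update: u_{i+1}(x) = -(b * p_{i+1} / r) * x.
    """
    p = 0.0  # V_0(x) = 0
    k = 0.0  # initial feedback gain, u_0(x) = -k * x = 0
    for _ in range(iterations):
        a_cl = a - b * k                  # closed-loop pole under u_i
        decay = math.exp(2.0 * a_cl * T)  # x(t+T)^2 / x(t)^2
        # Integral of x(tau)^2 over [t, t+T], per unit x(t)^2.
        if abs(a_cl) > 1e-12:
            x2_integral = (decay - 1.0) / (2.0 * a_cl)
        else:
            x2_integral = T
        # Value update: integral reinforcement plus previous value at t+T.
        p = (q + r * k * k) * x2_integral + p * decay
        # Policy improvement for the quadratic value function.
        k = b * p / r
    return p, k


def riccati_solution(a, b, q, r):
    """Positive root of 2*a*p - (b**2 / r) * p**2 + q = 0 (scalar Riccati equation)."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k = irl_value_iteration(a=-1.0, b=1.0, q=1.0, r=1.0, T=0.1, iterations=500)
    print("iterated p:", p, "gain k:", k)
    print("Riccati p: ", riccati_solution(-1.0, 1.0, 1.0, 1.0))
```

In this toy setting the fixed point of the iteration satisfies the scalar Riccati equation, so the iterated p can be compared against the closed-form root. Handling time delay, a reference trajectory, and neural network approximation, as the abstract describes, would require the formulation given in the paper itself.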
