Data-driven finite-horizon optimal tracking control scheme for completely unknown discrete-time nonlinear systems

Abstract This paper proposes a data-based finite-horizon optimal tracking control approach for completely unknown discrete-time nonlinear affine systems. First, an identifier is designed from input and output data to identify the system function and the system model. Then, based on the tracking error, the system is transformed into an augmented system with a finite-time optimal performance index. Within the finite horizon, iterative approximate dynamic programming (ADP) is used to solve the Hamilton–Jacobi–Bellman (HJB) equation by minimizing the performance index function. The scheme is carried out by policy iteration (PI) based on the model neural network, which makes the iterative control of the augmented system available at each step. Meanwhile, an action neural network is used to obtain the approximate optimal tracking control law, and a critic neural network is used to approximate the optimal performance index function of the augmented system. The paper then analyzes the convergence and stability of the iterative ADP algorithm and of the PI-based weight estimation errors, respectively. Finally, a simulation example is given to illustrate the theoretical results and the proposed approach.
