Neuro-Optimal Trajectory Tracking With Value Iteration of Discrete-Time Nonlinear Dynamics

In this article, a novel neuro-optimal tracking control approach is developed for discrete-time nonlinear systems. By constructing a new augmented plant, the optimal trajectory tracking design is transformed into an optimal regulation problem. For discrete-time nonlinear dynamics, the steady control input corresponding to the reference trajectory is given. Then, the value-iteration-based tracking control algorithm is provided and the convergence of the value function sequence is established. In addition, the approximation error between the iterative value function and the optimal cost is estimated. The uniformly ultimately bounded stability of the closed-loop system is also discussed in detail. Moreover, the iterative heuristic dynamic programming (HDP) algorithm is implemented with critic and action networks, and some new updating rules for the action network are provided. Finally, two examples are used to demonstrate the optimality of the resulting controller as well as the effectiveness of the proposed method.
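
To make the tracking-to-regulation idea concrete, the following is a minimal sketch and not the article's algorithm. It assumes a scalar linear plant x(k+1) = a*x(k) + b*u(k) with a quadratic cost, a special case in which value iteration started from a zero value function reduces to a scalar Riccati recursion and no neural networks are needed. The function names, the plant parameters, and the sinusoidal reference are all hypothetical choices made for illustration.

```python
import math


def steady_control(a, b, r_now, r_next):
    """Steady control u_d that keeps the plant on the reference:
    r_next = a * r_now + b * u_d (assumes b != 0)."""
    return (r_next - a * r_now) / b


def value_iteration(a, b, q, rho, iters=200, tol=1e-12):
    """Value iteration on the error dynamics e(k+1) = a*e(k) + b*v(k),
    where v = u - u_d, with utility q*e^2 + rho*v^2 and V_0 = 0.

    With V_i(e) = p_i * e^2, the update
        V_{i+1}(e) = min_v [q*e^2 + rho*v^2 + V_i(a*e + b*v)]
    becomes a recursion on the scalar p_i.
    """
    p = 0.0
    history = [p]
    for _ in range(iters):
        p_next = q + a * a * p - (a * b * p) ** 2 / (rho + b * b * p)
        history.append(p_next)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    gain = a * b * p / (rho + b * b * p)  # greedy policy: v = -gain * e
    return p, gain, history


def simulate(a, b, gain, reference, x0):
    """Apply u = u_d - gain * e and return the tracking errors."""
    x = x0
    errors = []
    for k in range(len(reference) - 1):
        e = x - reference[k]
        errors.append(e)
        u = steady_control(a, b, reference[k], reference[k + 1]) - gain * e
        x = a * x + b * u
    return errors


if __name__ == "__main__":
    a, b, q, rho = 1.2, 1.0, 1.0, 1.0          # hypothetical open-loop unstable plant
    reference = [math.sin(0.1 * k) for k in range(31)]
    p, gain, history = value_iteration(a, b, q, rho)
    errors = simulate(a, b, gain, reference, x0=1.0)
    print("p =", p, "gain =", gain)
    print("first error:", errors[0], "last error:", errors[-1])
```

In this sketch the sequence p_i is nondecreasing from p_0 = 0, which mirrors the convergence of the value function sequence mentioned in the abstract, and the steady control cancels the reference dynamics so that only the error has to be regulated. The nonlinear setting treated in the article would need function approximators in place of the closed-form recursion.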
