Optimal tracking control based on reinforcement learning value iteration algorithm for time-delayed nonlinear systems with external disturbances and input constraints

Abstract: This article investigates the design of an optimal tracking controller for a class of continuous-time nonlinear systems with time delay, mismatched external disturbances, and input constraints. Integral reinforcement learning (IRL) is used to determine the control input that optimizes an objective function. A disturbance observer is designed so that an estimate of the external disturbances can be used in the recursive objective function. The optimal control input is derived from a Hamilton-Jacobi-Bellman (HJB) equation, which is solved using the iterative IRL algorithm. The proposed approach guarantees that, in the presence of mismatched disturbances, the output of the time-delayed nonlinear system tracks the desired trajectory with bounded error. A critic neural network is designed to implement the approach. The effectiveness of the method is illustrated by a simulation example.
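
To make the terms in the abstract concrete, the block below writes out a form of the constrained-input cost, the integral Bellman equation, and the value-iteration update that is common in the IRL literature. It is a sketch under stated assumptions, not the formulation used in this article: the symbols e, u, Q, R, \lambda, \gamma, T and V^{i} are introduced here purely for illustration, and the time delay, the disturbance estimate, and the observer are deliberately left out.

```latex
% Assumed symbols (illustrative only, not taken from the article):
%   e       tracking error            u       control input, |u_j| <= \lambda
%   Q, R    positive-definite weights (R diagonal)
%   \gamma  discount rate (>= 0)      T       reinforcement interval (> 0)
%   V^{i}   value estimate at iteration i, u^{i} the policy derived from it
\begin{align}
  r(e,u) &= e^{\top} Q\, e
            + 2\int_{0}^{u} \bigl(\lambda \tanh^{-1}(v/\lambda)\bigr)^{\top} R \,\mathrm{d}v, \\
  V\bigl(e(t)\bigr) &= \int_{t}^{t+T} \exp\bigl(-\gamma(\tau-t)\bigr)\,
            r\bigl(e(\tau),u(\tau)\bigr)\,\mathrm{d}\tau
            + \exp(-\gamma T)\, V\bigl(e(t+T)\bigr), \\
  V^{i+1}\bigl(e(t)\bigr) &= \int_{t}^{t+T} \exp\bigl(-\gamma(\tau-t)\bigr)\,
            r\bigl(e(\tau),u^{i}(\tau)\bigr)\,\mathrm{d}\tau
            + \exp(-\gamma T)\, V^{i}\bigl(e(t+T)\bigr).
\end{align}
```

In this sketch, the first line is a nonquadratic penalty of the kind typically used to keep each input component within the bound \lambda. The second is the integral form of the Bellman equation over an interval of length T. The third is a value-iteration update in which the right-hand side is evaluated with the current estimate V^{i}. A critic neural network would typically be trained to approximate V^{i} at each step.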
