Optimal neuro-control strategy for nonlinear systems with asymmetric input constraints

In this paper, we present an optimal neuro-control scheme for continuous-time (CT) nonlinear systems with asymmetric input constraints. First, we introduce a discounted cost function for the CT nonlinear systems to handle the asymmetric input constraints. Then, we derive the Hamilton-Jacobi-Bellman equation (HJBE) that arises in the discounted-cost optimal control problem. To obtain the optimal neurocontroller, we use a critic neural network (CNN) to solve the HJBE within the reinforcement learning framework. The CNN's weight vector is tuned via the gradient descent approach. Using the Lyapunov method, we prove that the CNN's weight vector and the closed-loop system are uniformly ultimately bounded. Finally, we verify the effectiveness of the proposed optimal neuro-control strategy through simulations of two examples.
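As a rough illustration of the critic-tuning idea described above, the following minimal sketch runs gradient descent on the squared residual of a discounted Hamiltonian for a toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar linear plant `x_dot = a*x + b*u`, the fixed unconstrained policy `u = -k*x`, the quadratic running cost, the single feature `phi(x) = x**2`, and the function name `critic_weight`. In particular, the asymmetric input constraints and the cost term that handles them are omitted, so the sketch only evaluates a fixed policy rather than solving the constrained HJBE.

```python
def critic_weight(a=-1.0, b=1.0, k=1.0, gamma=0.1, lr=0.05, epochs=200):
    """Tune a one-weight critic V(x) ~ w * x**2 by gradient descent.

    Hypothetical toy setup (not the paper's system):
      plant   x_dot = a*x + b*u
      policy  u = -k*x            (fixed, unconstrained)
      cost    r(x, u) = x**2 + u**2, discounted at rate gamma
    The discounted Hamiltonian residual is
      e = r + dV/dx * x_dot - gamma * V.
    """
    states = [i / 10.0 for i in range(-10, 11)]  # sample states in [-1, 1]
    w = 0.0  # initial critic weight
    for _ in range(epochs):
        for x in states:
            u = -k * x
            x_dot = a * x + b * u
            reward = x * x + u * u
            phi = x * x        # critic feature
            dphi = 2.0 * x     # derivative of the feature w.r.t. x
            # Derivative of the residual with respect to the weight w.
            grad_e = dphi * x_dot - gamma * phi
            e = reward + w * grad_e
            # Gradient descent step on the squared residual 0.5 * e**2.
            w -= lr * e * grad_e
    return w


if __name__ == "__main__":
    w = critic_weight()
    # For this toy problem the exact value function is V(x) = w_star * x**2.
    w_star = (1.0 + 1.0 ** 2) / (2.0 * (1.0 * 1.0 - (-1.0)) + 0.1)
    print(w, w_star)
```

For this toy problem the closed-form weight is `(1 + k**2) / (2*(b*k - a) + gamma)`, so the learned weight can be checked against it. A single quadratic feature suffices only because the assumed plant and policy are linear; a nonlinear system would need a richer feature vector.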
