Event-Triggered Optimal Neuro-Controller Design With Reinforcement Learning for Unknown Nonlinear Systems

This paper develops an optimal control scheme for continuous-time unknown nonlinear systems using the event-triggering mechanism. Unlike time-triggered designs, the event-triggered controller is updated only when the system state deviates from its value at the latest triggering instant by more than a prescribed threshold. To obtain the event-triggered optimal controller, we develop an identifier-critic architecture under the framework of reinforcement learning. The identifier network, implemented as a feedforward neural network (FNN), learns the unknown system dynamics, and the critic network, also an FNN, is used to derive the event-triggered optimal controller. The identifier network is tuned by combining a standard back-propagation algorithm with an $e$-modification method, and the critic network is updated using a modified gradient descent method. By introducing an additional stability term into the critic update, an initial admissible control is no longer required. Moreover, by using historical and instantaneous state data together, the persistence of excitation condition is relaxed. A stability analysis of the closed-loop system is provided based on the Lyapunov method. The effectiveness of the proposed design is illustrated through simulations of a nonlinear example and a single-link robot arm system.
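As a rough illustration only, and not the authors' algorithm, the Python sketch below shows two ideas the abstract highlights on a toy scalar plant x_dot = -x + u with cost integrand x^2 + u^2: a controller recomputed only when an event-triggering condition fires, and a critic weight updated by gradient descent on the Hamiltonian residual using both the instantaneous state and a buffer of past sampled states. The identifier network is omitted (the toy dynamics are assumed known), and all function names, gains, and the specific threshold rule are hypothetical choices for this sketch.

```python
# A rough, hypothetical sketch (not the paper's algorithm): event-triggered
# critic learning on the toy scalar plant x_dot = -x + u with cost x^2 + u^2.
# The value function is approximated as V(x) ~= w * x^2, so u = -w * x and the
# exact optimal weight is w = sqrt(2) - 1.
import math


def simulate(T=20.0, dt=0.01, trigger_coeff=0.1, lr=0.5):
    x = 1.0            # plant state
    x_hat = x          # state at the last triggering instant (zero-order hold)
    u = 0.0            # control input, recomputed only at triggering instants
    w = 0.0            # critic weight
    history = []       # buffer of past sampled states (historical data)
    events = 0

    for _ in range(int(T / dt)):
        # Event-triggering rule (a hypothetical choice): update the controller
        # only when the gap between the current state and the last sampled
        # state exceeds a state-dependent threshold.
        if abs(x - x_hat) > trigger_coeff * abs(x):
            x_hat = x
            u = -w * x_hat
            history.append(x_hat)
            events += 1

        # Critic update from instantaneous plus historical data: a gradient
        # step on the squared Hamiltonian residual of the HJB equation.
        for xs in [x] + history[-10:]:
            u_s = -w * xs
            residual = xs**2 + u_s**2 + 2.0 * w * xs * (-xs + u_s)
            dres_dw = -2.0 * xs**2 * (1.0 + w)
            w -= lr * dt * residual * dres_dw

        # Integrate the plant under the held control input.
        x += dt * (-x + u)

    return x, w, events


if __name__ == "__main__":
    x_final, w_final, n_events = simulate()
    print(f"final state {x_final:.4f}, critic weight {w_final:.4f} "
          f"(exact {math.sqrt(2) - 1:.4f}), events {n_events}")
```

Replaying the buffered samples is what stands in here for the relaxed persistence of excitation condition: even after the state has nearly converged, the stored data keep driving the critic weight toward its analytic value w = sqrt(2) - 1, while the controller itself changes only at triggering instants.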
