Safe Intermittent Reinforcement Learning With Static and Dynamic Event Generators

In this article, we present an intermittent framework for safe reinforcement learning (RL) algorithms. First, we develop a barrier function-based system transformation that imposes state constraints while converting the original problem into an unconstrained optimization problem. Second, based on the derived optimal policies, we present two types of intermittent feedback RL algorithms: a static one and a dynamic one. Finally, we leverage an actor/critic structure to solve the problem online while guaranteeing optimality, stability, and safety. Simulation results show the efficacy of the proposed approach.
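To make the idea of a barrier function-based transformation concrete, the following is a minimal sketch and not the article's own construction. It assumes a scalar state x constrained to an interval (a, A) with a < 0 < A and uses a logarithmic barrier map that is common in the constrained-control literature; the function names `barrier` and `barrier_inverse` and the specific form of the map are assumptions made for illustration. The map sends the constrained state to an unconstrained variable s, so any bounded trajectory in s corresponds to a state that stays strictly inside (a, A).

```python
import math


def barrier(x: float, a: float, A: float) -> float:
    """Map a constrained state x in (a, A), a < 0 < A, to an unconstrained s.

    Assumed form: s = log( A (a - x) / ( a (A - x) ) ).
    s = 0 at x = 0, s -> -inf as x -> a, and s -> +inf as x -> A.
    """
    if not (a < 0.0 < A):
        raise ValueError("bounds must satisfy a < 0 < A")
    if not (a < x < A):
        raise ValueError("state must lie strictly inside (a, A)")
    return math.log((A * (a - x)) / (a * (A - x)))


def barrier_inverse(s: float, a: float, A: float) -> float:
    """Recover x in (a, A) from the unconstrained variable s.

    Closed form: x = a A (e^s - 1) / (a e^s - A).
    The two branches are algebraically identical; they only avoid
    overflow of exp() for large |s|.
    """
    if not (a < 0.0 < A):
        raise ValueError("bounds must satisfy a < 0 < A")
    if s > 0.0:
        em = math.exp(-s)
        return a * A * (1.0 - em) / (a - A * em)
    e = math.exp(s)
    return a * A * (e - 1.0) / (a * e - A)


if __name__ == "__main__":
    a, A = -1.0, 2.0  # hypothetical state bounds
    for x in (-0.9, 0.0, 1.5):
        s = barrier(x, a, A)
        print(f"x = {x:+.2f}  ->  s = {s:+.4f}  ->  x = {barrier_inverse(s, a, A):+.4f}")
```

Under these assumptions, a learning algorithm can work entirely in the s coordinates, where no constraint appears, and the inverse map recovers a state that respects the original bounds.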
