Safe Intermittent Reinforcement Learning for Nonlinear Systems

This paper develops an online intermittent actor-critic reinforcement learning method that optimally stabilizes nonlinear systems while guaranteeing safety. A barrier-function-based transformation is introduced to ensure that the system does not violate user-defined safety constraints. It is shown that the safety constraints of the original system are guaranteed by ensuring the stability of the equilibrium point of an appropriately transformed system. An online intermittent actor-critic learning framework is then developed to learn the optimal safe intermittent controller, and Zeno behavior is shown to be excluded. Finally, numerical examples are presented to verify the efficacy of the learning algorithm.
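As an illustration of the barrier-function-based transformation mentioned above, the following is a minimal sketch. The abstract does not state which barrier function is used, so the log-type form below, the bounds `a < 0 < A`, and the function names `barrier` and `barrier_inverse` are assumptions chosen for illustration only. The sketch shows the key property: a state confined to the open interval (a, A) maps to an unconstrained variable, and every finite value of that variable maps back strictly inside (a, A).

```python
import math


def barrier(x, a, A):
    """Map a constrained state x in (a, A) to an unconstrained variable s.

    Assumed form (not taken from the abstract):
        s = log( A (a - x) / (a (A - x)) ),  with a < 0 < A.
    s = 0 at x = 0, s -> -inf as x -> a, and s -> +inf as x -> A.
    """
    if not (a < 0 < A):
        raise ValueError("expected a < 0 < A")
    if not (a < x < A):
        raise ValueError("x must lie strictly inside (a, A)")
    return math.log((A * (a - x)) / (a * (A - x)))


def barrier_inverse(s, a, A):
    """Map an unconstrained s back to the constrained state x in (a, A)."""
    if not (a < 0 < A):
        raise ValueError("expected a < 0 < A")
    e = math.exp(s)
    # The denominator a*e - A is strictly negative because a < 0 < A.
    return a * A * (e - 1.0) / (a * e - A)


if __name__ == "__main__":
    a, A = -1.0, 2.0  # hypothetical safety bounds
    for x in (-0.9, 0.0, 1.5):
        s = barrier(x, a, A)
        print(x, s, barrier_inverse(s, a, A))
```

Under these assumptions, keeping the transformed variable bounded (for example, by stabilizing its equilibrium) keeps the original state inside the safe interval, which is the idea the abstract describes.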
