Continuous-Time Safe Learning with Temporal Logic Constraints in Adversarial Environments

This paper investigates a safe learning problem subject to linear temporal logic (LTL) constraints under persistent adversarial inputs, with quantified performance and robustness. Via a finite state automaton, the LTL specification is first decomposed into a sequence of two-point boundary value problems (TPBVPs), each of which has an invariant safety zone. We then employ a system transformation, based on logarithmic barrier and hyperbolic-type functions, that guarantees state and control safety against a worst-case adversarial input attempting to push the system outside the safety set. A safe learning method solves each sub-problem, in which the actors (approximators of the optimal control and of the worst-case adversarial input) and the critic (approximator of the cost) are tuned to learn the optimal policies without violating safety. Finally, a Lyapunov stability analysis proves boundedness of the closed-loop system, and simulation results validate the effectiveness of the approach.
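As a concrete illustration of the kind of transformation the abstract refers to, the sketch below shows a logarithmic barrier map that sends a bounded state interval onto the whole real line (so unconstrained learning in the transformed coordinate can never violate the state bound) together with a hyperbolic-type squashing for input saturation. The bounds `a`, `b`, `u_max` and the function names are illustrative assumptions, not the paper's exact construction.

```python
import math

# Assumed state constraint x in (a, b) with a < 0 < b, and control bound
# |u| < u_max. These particular forms are a common choice in barrier-based
# safe learning, sketched here for illustration only.

def barrier(x, a, b):
    """Logarithmic barrier map: (a, b) -> (-inf, +inf), with barrier(0) = 0.
    s -> -inf as x -> a and s -> +inf as x -> b, so any finite trajectory
    in the transformed coordinate s stays strictly inside (a, b)."""
    return math.log((b * (x - a)) / (a * (x - b)))

def barrier_inv(s, a, b):
    """Inverse of the barrier map: recover the constrained state x from s."""
    return a * b * (math.exp(s) - 1.0) / (a * math.exp(s) - b)

def saturate(v, u_max):
    """Hyperbolic-type squashing: any unconstrained v gives |u| < u_max."""
    return u_max * math.tanh(v / u_max)
```

The dynamics are rewritten in the `s` coordinate, the learning laws are run there, and `barrier_inv` maps the result back to the physical state, which is how state safety is enforced by construction rather than by penalty.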
