Safe Reinforcement Learning Using Robust Action Governor

Reinforcement Learning (RL) is essentially a trial-and-error learning procedure, which may cause unsafe behavior during the exploration-and-exploitation process. This hinders the application of RL to real-world control problems, especially those involving safety-critical systems. In this paper, we introduce a framework for safe RL based on integrating an RL algorithm with an add-on safety supervision module, called the Robust Action Governor (RAG), which exploits set-theoretic techniques and online optimization to manage safety-related requirements during learning. We illustrate the proposed safe RL framework through an application to automotive adaptive cruise control.
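
To make the idea of an add-on safety supervisor concrete, the following is a minimal sketch of an action-governor-style filter on a toy problem. It is not the RAG formulation of this paper: the scalar dynamics x+ = x + u + w, the disturbance bound, the safe set |x| <= 1, the gridded action set, and the names `step`, `is_robustly_safe`, and `govern` are all assumptions chosen for illustration. The filter replaces the action proposed by the learner with the closest admissible action that keeps the next state inside the safe set for every disturbance in the assumed bound.

```python
import numpy as np

# All constants below are illustrative assumptions, not values from the paper.
W_MAX = 0.1                            # disturbance bound: |w| <= W_MAX
X_MAX = 1.0                            # safe set: {x : |x| <= X_MAX}
U_GRID = np.linspace(-1.0, 1.0, 201)   # admissible actions, gridded
TOL = 1e-9                             # tolerance for floating-point comparisons


def step(x, u, w):
    """Toy scalar dynamics (assumed): x+ = x + u + w."""
    return x + u + w


def is_robustly_safe(x, u):
    """True if the next state stays in the safe set for every |w| <= W_MAX.

    The dynamics are affine in w and the safe set is an interval, so it is
    enough to check the two extreme disturbances.
    """
    return all(abs(step(x, u, w)) <= X_MAX + TOL for w in (-W_MAX, W_MAX))


def govern(x, u_rl):
    """Return the admissible action closest to u_rl that is robustly safe."""
    safe_actions = [u for u in U_GRID if is_robustly_safe(x, u)]
    if not safe_actions:
        raise ValueError("no robustly safe action exists at this state")
    return float(min(safe_actions, key=lambda u: abs(u - u_rl)))


def rollout(n_steps=200, seed=0):
    """Apply the filter to random 'exploratory' actions and record states."""
    rng = np.random.default_rng(seed)
    x, states = 0.0, []
    for _ in range(n_steps):
        u_rl = rng.uniform(-1.0, 1.0)      # stand-in for an RL policy's action
        u = govern(x, u_rl)                # supervised action actually applied
        w = rng.uniform(-W_MAX, W_MAX)     # unknown bounded disturbance
        x = step(x, u, w)
        states.append(x)
    return states
```

In this toy setting the interval |x| <= 1 happens to be robustly invariant under the admissible actions, so a safe action always exists and the filter can be applied at every step of exploration. For more general dynamics and constraints, a safe set with this property would have to be computed beforehand, which the sketch does not attempt.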
