On exploration requirements for learning safety constraints

Enforcing safety for dynamical systems is challenging, since it requires constraint satisfaction along predicted trajectories. Equivalent control constraints can be computed in the form of sets that enforce positive invariance, and can thus guarantee safety in feedback controllers without predictions. However, such constraints are cumbersome to compute from models, and it is not yet well established how to infer them from data. In this paper, we shed light on the key objects involved in learning control constraints from data in a model-free setting. In particular, we discuss the family of constraints that enforce safety in the context of a nominal control policy, and show that these constraints do not need to be accurate everywhere. They only need to correctly exclude a subset of the state-action pairs that would cause failure, which we call the critical set.
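To make these objects concrete, one standard formalization from viability theory can serve as a reference point; the notation below is illustrative and not necessarily that of the paper. Consider a discrete-time system $x_{k+1} = f(x_k, u_k)$ with admissible state set $\mathcal{X}$ (leaving $\mathcal{X}$ counts as failure) and input set $\mathcal{U}$. A set $\mathcal{S} \subseteq \mathcal{X}$ is control invariant (viable) if
\[
\forall x \in \mathcal{S} \;\; \exists u \in \mathcal{U} : \; f(x, u) \in \mathcal{S},
\]
and the induced control constraint at a state $x \in \mathcal{S}$ is the set of actions that keep the system inside $\mathcal{S}$,
\[
\mathcal{C}(x) = \{\, u \in \mathcal{U} : f(x, u) \in \mathcal{S} \,\}.
\]
Under this reading, the critical set corresponds (roughly, and restricted to the state-action pairs reachable under the nominal policy) to pairs $(x, u)$ with $x \in \mathcal{S}$ but $f(x, u) \notin \mathcal{S}$: a learned constraint only has to exclude these pairs to guarantee safety, and may be inaccurate elsewhere.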
