Learning for Safety-Critical Control with Control Barrier Functions

Modern nonlinear control theory seeks to endow systems with properties of stability and safety, and has been deployed successfully in multiple domains. Despite this success, model uncertainty remains a significant challenge in synthesizing safe controllers, leading to degradation in the properties the controllers provide. This paper develops a machine learning framework utilizing Control Barrier Functions (CBFs) to reduce model uncertainty as it impacts the safe behavior of a system. This approach iteratively collects data and updates a controller, ultimately achieving safe behavior. We validate this method in simulation and experimentally on a Segway platform.
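For readers unfamiliar with CBFs, the following is a minimal sketch of a CBF safety filter on a toy problem. It is not the learning framework of the paper: it assumes a perfectly known model, which is exactly the assumption that model uncertainty breaks. The system (a one-dimensional integrator with dynamics dx/dt = u), the barrier h(x) = x_max - x, the gain alpha, and the names cbf_filter and simulate are all hypothetical choices made for illustration. The filter returns the input closest to a desired input u_des that satisfies the standard CBF condition Lf h + Lg h u + alpha h >= 0, using the closed-form solution that exists when there is a single affine constraint.

```python
import numpy as np

def cbf_filter(u_des, lfh, lgh, h, alpha=1.0):
    """Closed-form minimizer of ||u - u_des||^2 subject to
    lfh + lgh @ u + alpha * h >= 0 (a single affine CBF constraint)."""
    u_des = np.asarray(u_des, dtype=float)
    lgh = np.asarray(lgh, dtype=float)
    # Value of the CBF constraint at the desired input.
    slack = lfh + lgh @ u_des + alpha * h
    if slack >= 0.0:
        return u_des  # desired input already satisfies the constraint
    norm_sq = lgh @ lgh
    if norm_sq == 0.0:
        raise ValueError("CBF constraint is infeasible: Lg h is zero")
    # Project u_des onto the boundary of the constraint half-space.
    return u_des - slack * lgh / norm_sq

def simulate(x0=0.0, x_max=1.0, u_des=2.0, alpha=5.0, dt=0.01, steps=500):
    """Forward-Euler rollout of dx/dt = u under the filtered input."""
    x = x0
    xs = [x]
    for _ in range(steps):
        h = x_max - x  # barrier value; the safe set is {x : h(x) >= 0}
        # For this toy system, Lf h = 0 and Lg h = -1.
        u = cbf_filter([u_des], lfh=0.0, lgh=[-1.0], h=h, alpha=alpha)
        x = x + dt * u[0]
        xs.append(x)
    return np.array(xs)

if __name__ == "__main__":
    xs = simulate()
    print("largest state reached:", xs.max())
```

In this rollout the desired input pushes the state toward the boundary x_max, and the filter slows it so the state approaches the boundary without crossing it. If the Lf h and Lg h terms supplied to the filter came from an inaccurate model, the same filter could certify unsafe inputs, which is the degradation the abstract refers to.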
