Neural Certificates for Safe Control Policies

This paper develops an approach to learn control policies for dynamical systems that are guaranteed to be both provably safe and goal-reaching. Here, safety means that the policy must never drive the state of the system into an unsafe region, while goal-reaching requires that the trajectory of the closed-loop system asymptotically converge to a goal region (a generalization of stability). We obtain a safe and goal-reaching policy by jointly learning two additional certificate functions: a barrier function that guarantees safety, and a Lyapunov-like function that enforces the goal-reaching requirement, both represented by neural networks. We demonstrate the effectiveness of the method for learning safe and goal-reaching policies on a variety of systems, including pendulums, cart-poles, and UAVs.
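The two certificate conditions described above can be made concrete on a toy example. The sketch below (illustrative only; the dynamics, policy, and candidate functions are hypothetical choices, not the paper's learned networks) empirically checks the Lyapunov-like decrease condition and the forward invariance of the safe set for a 1-D closed-loop system:

```python
import numpy as np

# Toy closed-loop system x' = x + u with linear policy u = -2x.
# All names and numerical choices here are illustrative assumptions.
def f(x, u):
    return x + u            # open-loop dynamics

def policy(x):
    return -2.0 * x         # stabilizing feedback (closed loop: x' = -x)

def V(x):
    return x ** 2           # Lyapunov-like candidate: V(0) = 0, V > 0 elsewhere

def B(x):
    return 1.0 - x ** 2     # barrier candidate: B >= 0 on the safe set |x| <= 1

dt = 0.01
xs = np.linspace(-0.9, 0.9, 181)
xs = xs[np.abs(xs) > 1e-6]             # exclude the goal point itself
x_next = xs + dt * f(xs, policy(xs))   # one Euler step of the closed loop

decrease_ok = bool(np.all(V(x_next) < V(xs)))   # V decreases along trajectories
invariant_ok = bool(np.all(B(x_next) >= 0.0))   # safe set stays forward-invariant
print(decrease_ok, invariant_ok)
```

In the learning setting, violations of these two inequalities over sampled states would instead be penalized as loss terms, with the policy and both certificates trained jointly.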
