Learning-Based Safe Control for Robots and Autonomous Vehicles Using Efficient Safety Certificates

Energy-function-based safety certificates can provide provable safety guarantees for the complex control systems used in safety-critical tasks. However, recent studies on learning-based energy-function synthesis have focused only on feasibility, which can lead to over-conservatism and reduced controller efficiency. In this study, we propose a magnitude regularization technique that improves the efficiency of safe controllers by reducing the conservativeness of the energy function while preserving its safety guarantees. Specifically, we measure conservativeness by the magnitude of the energy function and reduce it by adding a magnitude regularization term to the training loss. We present the SafeMR algorithm, which uses reinforcement learning (RL) to unify the learning of the safety controller and the energy function. To verify the effectiveness of the algorithm, we conducted two sets of experiments: one in a robot-based environment and the other in an autonomous-vehicle environment. The experimental results demonstrate that the proposed approach reduces the conservativeness of the energy function and outperforms the baselines in controller efficiency while ensuring safety.
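The core idea above, penalizing the magnitude of the energy function on top of the usual feasibility objective, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function and parameter names (`phi_vals`, `feasibility_loss`, `reg_coef`) are hypothetical, and the actual SafeMR loss is trained jointly with an RL policy.

```python
def magnitude_regularized_loss(phi_vals, feasibility_loss, reg_coef=0.1):
    """Feasibility objective plus a magnitude regularization term.

    phi_vals: energy-function values phi(s) evaluated on sampled states.
    The magnitude term penalizes large |phi(s)|, so among certificates
    that are equally feasible, the less conservative one scores lower.
    """
    magnitude_term = sum(abs(v) for v in phi_vals) / len(phi_vals)
    return feasibility_loss + reg_coef * magnitude_term

# Two candidate certificates with identical feasibility loss: the one
# with smaller energy magnitudes receives the lower total loss.
small = magnitude_regularized_loss([0.1, -0.2, 0.15], feasibility_loss=1.0)
large = magnitude_regularized_loss([2.0, -3.0, 2.5], feasibility_loss=1.0)
```

In a learned setting, `phi_vals` would come from a neural energy function and the combined loss would be minimized by gradient descent alongside the policy objective; the regularization coefficient trades off conservativeness reduction against feasibility.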
