Learning Safe Multi-Agent Control with Decentralized Neural Barrier Certificates

We study the multi-agent safe control problem, in which agents must avoid collisions with static obstacles and with each other while reaching their goals. Our core idea is to learn the multi-agent control policy jointly with control barrier functions that serve as safety certificates. We propose a novel joint-learning framework that can be implemented in a decentralized fashion, with generalization guarantees for certain function classes. Such a decentralized framework can adapt to an arbitrarily large number of agents. Building on this framework, we further improve scalability by incorporating neural network architectures that are invariant to the number and permutation of neighboring agents. In addition, we propose a new spontaneous policy refinement method to further enforce the certificate condition during testing. We provide extensive experiments demonstrating that our method significantly outperforms other leading multi-agent control approaches in both maintaining safety and completing the original tasks. Our approach also shows exceptional generalization capability: the control policy can be trained with 8 agents in one scenario and then deployed in other scenarios with up to 1024 agents, in complex multi-agent environments and dynamics.
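The invariance to the number and ordering of neighboring agents can be illustrated with a minimal sketch. It assumes a single shared ReLU layer followed by max pooling over neighbors; the function name `encode_neighbors`, the layer sizes, and the random weights are hypothetical and are not taken from the paper, whose actual architecture may differ.

```python
import numpy as np

def encode_neighbors(neighbor_states, weights, bias):
    """Map a (k, d) array of neighbor states to a fixed-size feature vector.

    Hypothetical sketch: every neighbor passes through the same layer, and
    the results are max-pooled, so the output depends neither on the order
    of the rows nor on how many rows there are.
    """
    if neighbor_states.shape[0] == 0:
        # No neighbors in range: return an all-zero feature (an assumption).
        return np.zeros(weights.shape[1])
    per_neighbor = np.maximum(neighbor_states @ weights + bias, 0.0)  # shared ReLU layer
    return per_neighbor.max(axis=0)  # element-wise max over neighbors

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 16))   # 4-dim relative state -> 16 features (illustrative sizes)
bias = rng.normal(size=16)

neighbors = rng.normal(size=(5, 4))          # 5 neighbors
shuffled = neighbors[rng.permutation(5)]     # same neighbors, different order
more_neighbors = rng.normal(size=(9, 4))     # 9 neighbors

feat = encode_neighbors(neighbors, weights, bias)
feat_shuffled = encode_neighbors(shuffled, weights, bias)
feat_more = encode_neighbors(more_neighbors, weights, bias)
```

Because the pooled feature has the same size for 5 or 9 neighbors, a downstream controller or certificate network in this sketch would not need to change when the number of nearby agents changes.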
