Paper information - Stackelberg Meta-Learning for Strategic Guidance in Multi-Robot Trajectory Planning

Trajectory guidance requires a leader robotic agent to help a follower robotic agent reach the target destination cooperatively. However, planning cooperation becomes difficult when the leader serves a family of different followers and has incomplete information about them. The leader therefore needs to learn different cooperation plans and adapt them quickly. We develop a Stackelberg meta-learning approach to address this challenge. We first formulate the guided trajectory planning problem as a dynamic Stackelberg game to capture the leader-follower interactions. Then, we leverage meta-learning to develop cooperative strategies for different followers. The leader learns a meta-best-response model from a prescribed set of followers. When a specific follower initiates a guidance query, the leader quickly adapts the meta-model to a follower-specific model using a small amount of learning data and uses it to perform trajectory guidance. Simulations show that our method generalizes and adapts better when learning followers' behavior than other learning approaches. A comparison with zero-guidance scenarios further demonstrates the value and effectiveness of guidance.

Quanyan Zhu | Yuhan Zhao
