Stackelberg Meta-Learning for Strategic Guidance in Multi-Robot Trajectory Planning

Trajectory guidance requires a leader robotic agent to assist a follower robotic agent in cooperatively reaching a target destination. However, planning this cooperation becomes difficult when the leader serves a family of different followers and has incomplete information about them. The leader therefore needs to learn different cooperation plans and adapt them quickly. We develop a Stackelberg meta-learning approach to address this challenge. We first formulate the guided trajectory planning problem as a dynamic Stackelberg game to capture the leader-follower interactions. Then, we leverage meta-learning to develop cooperative strategies for different followers. The leader learns a meta-best-response model from a prescribed set of followers. When a specific follower initiates a guidance query, the leader quickly adapts the meta-model to a follower-specific model using a small amount of learning data and uses it to perform trajectory guidance. Simulations show that our method generalizes and adapts better in learning followers' behavior than other learning approaches. Comparison with zero-guidance scenarios further demonstrates the value and effectiveness of guidance.
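
The learn-then-adapt idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a hypothetical scalar linear best-response model b = w * a + c for each follower type, noiseless response data, and a Reptile-style first-order meta-update swapped in for whatever meta-learning rule the authors use. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

def make_follower(rng):
    """Hypothetical follower type: best response b = w * a + c to leader action a."""
    return np.array([rng.uniform(0.5, 1.5), rng.uniform(-1.0, 1.0)])

def sample_responses(follower, rng, n):
    """Query the follower with n random leader actions and record its responses."""
    a = rng.uniform(-1.0, 1.0, size=n)
    return a, follower[0] * a + follower[1]

def grad(params, a, b):
    """Gradient of the mean squared prediction error w.r.t. (w, c)."""
    err = params[0] * a + params[1] - b
    return np.array([2.0 * np.mean(err * a), 2.0 * np.mean(err)])

def adapt(params, a, b, steps=5, lr=0.1):
    """Follower-specific adaptation: a few gradient steps from the given start."""
    p = params.copy()
    for _ in range(steps):
        p = p - lr * grad(p, a, b)
    return p

def meta_train(followers, rng, iters=2000, meta_lr=0.1, k=10):
    """Reptile-style meta-update (an assumption, not the paper's rule):
    move the meta-parameters toward each follower's adapted parameters."""
    meta = np.zeros(2)
    for _ in range(iters):
        follower = followers[rng.integers(len(followers))]
        a, b = sample_responses(follower, rng, k)
        meta = meta + meta_lr * (adapt(meta, a, b) - meta)
    return meta

rng = np.random.default_rng(0)
followers = [make_follower(rng) for _ in range(20)]  # prescribed follower set
meta = meta_train(followers, rng)

# A new follower issues a guidance query; the leader adapts from 5 samples.
new_follower = make_follower(rng)
a_few, b_few = sample_responses(new_follower, rng, 5)
adapted = adapt(meta, a_few, b_few)

err_before = float(np.linalg.norm(meta - new_follower))
err_after = float(np.linalg.norm(adapted - new_follower))
print("parameter error before adaptation:", err_before)
print("parameter error after adaptation:", err_after)
```

In this toy setting the adapted parameters would then stand in for the follower-specific best-response model that the leader uses when planning guidance; the planning step itself is omitted here.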
