Multi-Agent Imitation Learning with Copulas

Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping from observations to actions, which is essential for understanding physical, social, and team-play systems. However, most existing work on modeling multi-agent interactions assumes that agents make independent decisions based on their observations, ignoring the complex dependence among agents. In this paper, we propose to use copulas, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems. Our model separately learns marginals that capture the local behavioral patterns of each individual agent and a copula function that solely and fully captures the dependence structure among agents. Extensive experiments on synthetic and real-world datasets show that our model outperforms state-of-the-art baselines across various scenarios in the action prediction task and generates new trajectories close to expert demonstrations.
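The separation between per-agent marginals and a shared dependence structure can be illustrated with a small sketch. It is not the paper's model: it assumes a bivariate Gaussian copula with a fixed correlation `rho`, two hypothetical agents, and hand-picked marginals (a standard normal and an exponential with rate 2). The function names `sample_gaussian_copula` and `sample_joint_action` are made up for illustration.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()
_EPS = 1e-12  # keeps uniforms strictly inside (0, 1) so inverse CDFs stay finite


def sample_gaussian_copula(rho, rng):
    """Draw (u1, u2) from a bivariate Gaussian copula with correlation rho."""
    # Correlated standard normals: z2 shares a fraction rho of z1's variation.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Pushing each through the normal CDF gives uniform marginals while
    # preserving the dependence between the two coordinates.
    u1 = min(max(_STD_NORMAL.cdf(z1), _EPS), 1.0 - _EPS)
    u2 = min(max(_STD_NORMAL.cdf(z2), _EPS), 1.0 - _EPS)
    return u1, u2


def sample_joint_action(rho, inverse_cdfs, rng):
    """Map copula samples through each agent's own inverse marginal CDF."""
    u1, u2 = sample_gaussian_copula(rho, rng)
    return inverse_cdfs[0](u1), inverse_cdfs[1](u2)


# Hypothetical marginals: agent 1 is standard normal, agent 2 is exponential(rate=2).
inverse_cdfs = (
    NormalDist(0.0, 1.0).inv_cdf,
    lambda u: -math.log(1.0 - u) / 2.0,
)

rng = random.Random(0)
actions = [sample_joint_action(0.9, inverse_cdfs, rng) for _ in range(5)]
for a1, a2 in actions:
    print(f"agent 1: {a1:+.3f}   agent 2: {a2:.3f}")
```

Changing `rho` alters only how strongly the two agents' actions move together, while swapping an entry of `inverse_cdfs` alters only that agent's individual behavior. That independence of the two choices is the property the abstract attributes to the copula formulation.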
