Paper Information - Learning Game-Theoretic Models of Multiagent Trajectories Using Implicit Layers


For prediction of interacting agents' trajectories, we propose an end-to-end trainable architecture that hybridizes neural nets with game-theoretic reasoning, has interpretable intermediate representations, and transfers to downstream decision making. It uses a net that reveals preferences from the agents' past joint trajectory, and a differentiable implicit layer that maps these preferences to local Nash equilibria, forming the modes of the predicted future trajectory. Additionally, it learns an equilibrium refinement concept. For tractability, we introduce a new class of continuous potential games and an equilibrium-separating partition of the action space. We provide theoretical results for explicit gradients and soundness. In experiments, we evaluate our approach on two real-world data sets, where we predict highway driver merging trajectories, and on a simple decision-making transfer task.
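The idea of a differentiable implicit layer that maps preferences to an equilibrium can be illustrated with a toy sketch. This is not the paper's model: the two-agent quadratic game, the coupling weight `W`, and the function names are all assumptions made for illustration. In a potential game, a minimizer of the shared potential is a Nash equilibrium, so the forward pass can be a plain optimizer, while the gradient with respect to the preference parameters `theta` follows from the implicit function theorem rather than from unrolling the solver. Unlike the setting described above, this toy potential is convex and has a single equilibrium.

```python
# Hypothetical two-agent potential game (an illustration, not the paper's model).
# Agent i picks a scalar action a_i and pays
#   0.5 * (a_i - theta_i)**2 + 0.5 * W * (a_1 - a_2)**2.
# Each agent's gradient w.r.t. its own action equals the gradient of the potential
#   psi(a, theta) = 0.5 * sum_i (a_i - theta_i)**2 + 0.5 * W * (a_1 - a_2)**2,
# so a minimizer of psi is a Nash equilibrium.

W = 0.5  # assumed coupling weight between the two agents


def potential_grad(a, theta):
    """Gradient of psi with respect to the joint action a."""
    coupling = W * (a[0] - a[1])
    return [a[0] - theta[0] + coupling, a[1] - theta[1] - coupling]


def solve_equilibrium(theta, steps=500, lr=0.2):
    """Forward pass: gradient descent on psi until (near) stationarity."""
    a = [0.0, 0.0]
    for _ in range(steps):
        g = potential_grad(a, theta)
        a = [a[0] - lr * g[0], a[1] - lr * g[1]]
    return a


def equilibrium_jacobian():
    """Backward pass: d a*/d theta from the implicit function theorem.

    At the equilibrium grad_a psi(a*, theta) = 0, hence
    d a*/d theta = -H^{-1} @ d(grad_a psi)/d theta, with H the Hessian of psi in a.
    """
    h11, h12, h22 = 1.0 + W, -W, 1.0 + W  # Hessian entries (constant for this psi)
    det = h11 * h22 - h12 * h12
    h_inv = [[h22 / det, -h12 / det], [-h12 / det, h11 / det]]
    # d(grad_a psi)/d theta = -I, so the Jacobian is -H^{-1} @ (-I) = H^{-1}.
    return h_inv


theta = [1.0, 3.0]            # "revealed preferences": each agent's preferred action
a_star = solve_equilibrium(theta)
jacobian = equilibrium_jacobian()
print("equilibrium:", a_star)
print("d a*/d theta:", jacobian)
```

The Jacobian is obtained from the stationarity condition alone, so the solver used in the forward pass could be swapped for any other method that reaches the same equilibrium without changing the backward pass.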

Christoph-Nikolas Straehle | Philipp Geiger
