Multiagent trajectory models via game theory and implicit layer-based learning

For prediction of interacting agents' trajectories, we propose an end-to-end trainable architecture that hybridizes neural nets with game-theoretic reasoning, has interpretable intermediate representations, and transfers to robust downstream decision making. It combines (1) a differentiable implicit layer that maps preferences to local Nash equilibria with (2) a learned equilibrium refinement concept and (3) a learned preference revelation net, given initial trajectories as input. This is accompanied by a new class of continuous potential games. We provide theoretical results for explicit gradients and soundness, and several measures to ensure tractability. In experiments, we evaluate our approach on two real-world data sets, where we predict highway driver merging trajectories, and on a simple decision-making transfer task.
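To make the idea of a differentiable implicit layer concrete, here is a minimal sketch. It is not the architecture described above: it assumes a hypothetical two-agent game with scalar actions, a cost-minimization convention, and a hand-picked quadratic potential `psi`. The preference vector `theta`, the coupling weight `c`, and all function names are illustrative assumptions. In a potential game of this kind, a minimizer of the potential is a Nash equilibrium, and the implicit function theorem gives the gradient of that equilibrium with respect to the preferences without differentiating through the solver iterations.

```python
# Toy potential (an assumption, not the paper's model):
#   psi(a; theta) = 0.5 * sum_i (a_i - theta_i)^2 + 0.5 * c * (a_0 - a_1)^2
# Each agent i controls a_i; theta_i is its preferred action, c couples the agents.


def grad_potential(a, theta, c):
    """Gradient of psi with respect to the joint action a = [a_0, a_1]."""
    d = a[0] - a[1]
    return [a[0] - theta[0] + c * d, a[1] - theta[1] - c * d]


def solve_equilibrium(theta, c, lr=0.1, steps=2000):
    """Forward pass: plain gradient descent on psi to a stationary point.

    psi is strictly convex for c >= 0, so this point is the unique
    minimizer of the potential and hence a Nash equilibrium of the toy game.
    """
    a = [0.0, 0.0]
    for _ in range(steps):
        g = grad_potential(a, theta, c)
        a = [a[0] - lr * g[0], a[1] - lr * g[1]]
    return a


def implicit_jacobian(c):
    """Backward pass via the implicit function theorem.

    At the equilibrium, grad_a psi(a*, theta) = 0, so
        da*/dtheta = -H^{-1} (d^2 psi / da dtheta),
    with H = [[1 + c, -c], [-c, 1 + c]] and cross term -I, giving H^{-1}.
    """
    h00, h01 = 1.0 + c, -c
    det = h00 * h00 - h01 * h01
    return [[h00 / det, -h01 / det], [-h01 / det, h00 / det]]


if __name__ == "__main__":
    theta, c = [1.0, 3.0], 0.5
    print("equilibrium:", solve_equilibrium(theta, c))
    print("d a*/d theta:", implicit_jacobian(c))
```

The Jacobian here is closed-form only because the toy potential is quadratic. For a general potential, the same formula would require the Hessian evaluated at the computed equilibrium.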
