baller2vec++: A Look-Ahead Multi-Entity Transformer For Modeling Coordinated Agents

In many multi-agent spatiotemporal systems, the agents are under the influence of shared, unobserved variables (e.g., the play a team is executing in a game of basketball). As a result, the trajectories of the agents are often statistically dependent at any given time step; however, almost universally, multi-agent models implicitly assume the agents’ trajectories are statistically independent at each time step. In this paper, we introduce baller2vec++, a multi-entity Transformer that can effectively model coordinated agents. Specifically, baller2vec++ applies a specially designed self-attention mask to a mixture of location and “look-ahead” trajectory sequences to learn the distributions of statistically dependent agent trajectories. We show that, unlike baller2vec (baller2vec++’s predecessor), baller2vec++ can learn to emulate the behavior of perfectly coordinated agents in a simulated toy dataset. Additionally, when modeling the trajectories of professional basketball players, baller2vec++ outperforms baller2vec by a wide margin.
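The statistical point behind the abstract can be illustrated with a small sketch. It is not the baller2vec++ architecture and uses no attention mask; it only contrasts two ways of factorizing a joint distribution over agents' moves at one time step. The two-agent setup, the move names, and all function names are assumptions made for illustration: a hidden coin flip picks a shared "play", and both agents then move in the same direction.

```python
from itertools import product
from math import log

# Hypothetical toy data: two perfectly coordinated agents. A shared, unobserved
# coin flip decides whether both move left or both move right.
MOVES = ("left", "right")
TRUE_JOINT = {("left", "left"): 0.5, ("right", "right"): 0.5}


def marginal(agent):
    """Marginal distribution over one agent's move under the true joint."""
    probs = {move: 0.0 for move in MOVES}
    for outcome, p in TRUE_JOINT.items():
        probs[outcome[agent]] += p
    return probs


def independent_model(outcome):
    """Best model that assumes independence: product of per-agent marginals."""
    return marginal(0)[outcome[0]] * marginal(1)[outcome[1]]


def look_ahead_model(outcome):
    """Chain-rule model: p(a0) * p(a1 | a0), so agent 1 'sees' agent 0's move."""
    p_first = marginal(0)[outcome[0]]
    if p_first == 0.0:
        return 0.0
    p_second_given_first = TRUE_JOINT.get(outcome, 0.0) / p_first
    return p_first * p_second_given_first


def uncoordinated_mass(model):
    """Probability the model assigns to outcomes where the agents disagree."""
    return sum(model(o) for o in product(MOVES, repeat=2) if o[0] != o[1])


def expected_nll(model):
    """Expected negative log-likelihood of the true outcomes under the model."""
    return sum(p * -log(model(o)) for o, p in TRUE_JOINT.items())


if __name__ == "__main__":
    for name, model in [("independent", independent_model),
                        ("look-ahead", look_ahead_model)]:
        print(name, uncoordinated_mass(model), expected_nll(model))
```

In this toy case the independence assumption places half of its probability mass on outcomes that never occur, while the chain-rule factorization places none there and reaches a lower expected negative log-likelihood. Conditioning each agent's prediction on the moves already chosen for other agents is the role the abstract assigns to the "look-ahead" trajectory sequences.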
