Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding

Multi-agent trajectory forecasting in autonomous driving requires an agent to accurately anticipate the behaviors of surrounding vehicles and pedestrians for safe and reliable decision-making. Because the goals, contexts, and interactions of agents in these dynamic scenes are only partially observable, directly obtaining the posterior distribution over future agent trajectories remains a challenging problem. In realistic embodied environments, each agent's future trajectories should be diverse, since multiple plausible sequences of actions can lead to its intended goals, and they should be admissible, since they must obey physical constraints and stay in drivable areas. In this paper, we propose a model that fully synthesizes multiple input signals from the multimodal world (the environment's scene context and the interactions between multiple surrounding agents) to best model all diverse and admissible trajectories. We offer new metrics to evaluate the diversity of trajectory predictions while ensuring the admissibility of each trajectory. Based on our new metrics as well as those used in prior work, we compare our model with strong baselines and ablations across two datasets and show a 35% performance improvement over the state of the art.
