Implicit Latent Variable Model for Scene-Consistent Motion Forecasting

In order to plan a safe maneuver an autonomous vehicle must accurately perceive its environment, and understand the interactions among traffic participants. In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data. In particular, we propose to characterize the joint distribution over future trajectories via an implicit latent variable model. We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene. Coupled with a deterministic decoder, we obtain trajectory samples that are consistent across traffic participants, achieving state-of-the-art results in motion forecasting and interaction understanding. Last but not least, we demonstrate that our motion forecasts result in safer and more comfortable motion planning.

[1]  Philip H. S. Torr,et al.  DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Razvan Pascanu,et al.  Relational inductive biases, deep learning, and graph networks , 2018, ArXiv.

[3]  Renjie Liao,et al.  Discrete Residual Flow for Probabilistic Pedestrian Behavior Prediction , 2019, CoRL.

[4]  Ersin Yumer,et al.  Jointly Learnable Behavior and Trajectory Planning for Self-Driving Vehicles , 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[5]  Germán Ros,et al.  CARLA: An Open Urban Driving Simulator , 2017, CoRL.

[6]  Honglak Lee,et al.  Learning Structured Output Representation using Deep Conditional Generative Models , 2015, NIPS.

[7]  J. Eggert,et al.  Managing the complexity of inner-city scenes: An efficient situation hypotheses selection scheme , 2015, 2015 IEEE Intelligent Vehicles Symposium (IV).

[8]  Sergio Casas,et al.  End-To-End Interpretable Neural Motion Planner , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[10]  Sergio Casas,et al.  RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects , 2020, ECCV.

[11]  Yisong Yue,et al.  Coordinated Multi-Agent Imitation Learning , 2017, ICML.

[12]  Geoffrey J. Gordon,et al.  A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.

[13]  Renjie Liao,et al.  DSDNet: Deep Structured self-Driving Network , 2020, ECCV.

[14]  Dushyant Rao,et al.  Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[15]  Silvio Savarese,et al.  Social LSTM: Human Trajectory Prediction in Crowded Spaces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Christopher Burgess,et al.  beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework , 2016, ICLR 2016.

[17]  Bin Yang,et al.  Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18]  Mark E. Campbell,et al.  Contingency Planning Over Probabilistic Obstacle Predictions for Autonomous Road Vehicles , 2013, IEEE Transactions on Robotics.

[19]  Jiong Yang,et al.  PointPillars: Fast Encoders for Object Detection From Point Clouds , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Renjie Liao,et al.  Learning Lane Graph Representations for Motion Forecasting , 2020, ECCV.

[21]  Benjamin Sapp,et al.  MultiPath: Multiple Probabilistic Anchor Trajectory Hypotheses for Behavior Prediction , 2019, CoRL.

[22]  Ruslan Salakhutdinov,et al.  Multiple Futures Prediction , 2019, NeurIPS.

[23]  Bin Yang,et al.  Multi-Task Multi-Sensor Fusion for 3D Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  J. Andrew Bagnell,et al.  Maximum margin planning , 2006, ICML.

[25]  Bin Yang,et al.  HDNET: Exploiting HD Maps for 3D Object Detection , 2018, CoRL.

[26]  Helbing,et al.  Congested traffic states in empirical observations and microscopic simulations , 2000, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.

[27]  Shakir Mohamed,et al.  Learning in Implicit Generative Models , 2016, ArXiv.

[28]  Yedid Hoshen,et al.  VAIN: Attentional Multi-agent Predictive Modeling , 2017, NIPS.

[29]  Henggang Cui,et al.  Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[30]  Paul Vernaza,et al.  r2p2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting , 2018, ECCV.

[31]  Henggang Cui,et al.  Motion Prediction of Traffic Actors for Autonomous Driving using Deep Convolutional Networks , 2018, ArXiv.

[32]  Bin Yang,et al.  PIXOR: Real-time 3D Object Detection from Point Clouds , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33]  Mykel J. Kochenderfer,et al.  Multi-Agent Imitation Learning for Driving Simulation , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[34]  Yann LeCun,et al.  Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic , 2019, ICLR.

[35]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[36]  Renjie Liao,et al.  SpAGNN: Spatially-Aware Graph Neural Networks for Relational Behavior Forecasting from Sensor Data , 2019, 2020 IEEE International Conference on Robotics and Automation (ICRA).

[37]  Pietro Perona,et al.  DDT: Deep Driving Tree for Proactive Planning in Interactive Scenarios , 2018, 2018 21st International Conference on Intelligent Transportation Systems (ITSC).

[38]  Benjamin Sapp,et al.  Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Marco Pavone,et al.  The Trajectron: Probabilistic Multi-Agent Trajectory Modeling With Dynamic Spatiotemporal Graphs , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[40]  Denis Wolf,et al.  Scene Compliant Trajectory Forecast With Agent-Centric Spatio-Temporal Grids , 2019, IEEE Robotics and Automation Letters.

[41]  Kris M. Kitani,et al.  Forecasting Interactive Dynamics of Pedestrians with Fictitious Play , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Elena Corina Grigore,et al.  CoverNet: Multimodal Behavior Prediction Using Trajectory Sets , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Sergio Casas,et al.  IntentNet: Learning to Predict Intention from Raw Sensor Data , 2018, CoRL.

[44]  Yin Zhou,et al.  End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds , 2019, CoRL.

[45]  Samy Bengio,et al.  Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.

[46]  Kaiming He,et al.  Group Normalization , 2018, ECCV.

[47]  R. Zemel,et al.  Neural Relational Inference for Interacting Systems , 2018, ICML.

[48]  Xiangyang Xue,et al.  Arbitrary-Oriented Scene Text Detection via Rotation Proposals , 2017, IEEE Transactions on Multimedia.

[49]  Alain L. Kornhauser,et al.  Beyond Grand Theft Auto V for Training, Testing and Enhancing Deep Learning in Self Driving Cars , 2017, ArXiv.

[50]  Qiang Xu,et al.  nuScenes: A Multimodal Dataset for Autonomous Driving , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Dinesh Manocha,et al.  AutonoVi-Sim: Autonomous Vehicle Simulation Platform with Weather, Sensing, and Traffic Control , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[52]  Sergey Levine,et al.  PRECOG: PREdiction Conditioned on Goals in Visual Multi-Agent Settings , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[53]  Christoph Stiller,et al.  Automated Driving in Uncertain Environments: Planning With Interaction and Uncertain Maneuver Prediction , 2018, IEEE Transactions on Intelligent Vehicles.

[54]  Xiaodong Liu,et al.  Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing , 2019, NAACL.

[55]  Ferenc Huszar,et al.  How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary? , 2015, ArXiv.

[56]  Jing Huang,et al.  Improving Rotated Text Detection with Rotation Region Proposal Networks , 2018, ArXiv.

[57]  Daniel Krajzewicz,et al.  SUMO - Simulation of Urban MObility An Overview , 2011 .

[58]  Xi Chen,et al.  Learning From Demonstration in the Wild , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[59]  Chung Choo Chung,et al.  Probabilistic vehicle trajectory prediction over occupancy grid map via recurrent neural network , 2017, 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC).

[60]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[61]  Raquel Urtasun,et al.  End-to-end Contextual Perception and Prediction with Interaction Transformer , 2020, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[62]  Henggang Cui,et al.  Uncertainty-aware Short-term Motion Prediction of Traffic Actors for Autonomous Driving , 2018, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[63]  Pieter Abbeel,et al.  An Algorithmic Perspective on Imitation Learning , 2018, Found. Trends Robotics.

[64]  Yoshua Bengio,et al.  Professor Forcing: A New Algorithm for Training Recurrent Networks , 2016, NIPS.