Drowned out by the noise: Evidence for Tracking-free Motion Prediction

Autonomous driving consists of a multitude of interacting modules, where each module must contend with errors from the others. Typically, the motion prediction module depends on a robust tracking system to capture each agent’s past movement. In this work, we systematically explore the importance of the tracking module for the motion prediction task and ultimately conclude that the tracking module is detrimental to overall motion prediction performance when the module is imperfect (with as low as 1% error). We explicitly compare models that use tracking information to models that do not across multiple scenarios and conditions. We find that the tracking information only improves performance in noise-free conditions. A noise-free tracker is unlikely to remain noise-free in real-world scenarios, and the inevitable noise will subsequently negatively affect performance. We thus argue future work should be mindful of noise when developing and testing motion/tracking modules, or that they should do away with the tracking component entirely.

[1]  Rudolph van der Merwe,et al.  The unscented Kalman filter for nonlinear estimation , 2000, Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium (Cat. No.00EX373).

[2]  Jin Young Choi,et al.  Visual Path Prediction in Complex Scenes with Crowded Moving Objects , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Dinesh Manocha,et al.  TraPHic: Trajectory Prediction in Dense and Heterogeneous Traffic Using Weighted Interactions , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Philip H. S. Torr,et al.  DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Henggang Cui,et al.  Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[6]  Benjamin Sapp,et al.  MultiPath: Multiple Probabilistic Anchor Trajectory Hypotheses for Behavior Prediction , 2019, CoRL.

[7]  Jitendra Malik,et al.  Learning Visual Predictive Models of Physics for Playing Billiards , 2015, ICLR.

[8]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[9]  Dragomir Anguelov,et al.  VectorNet: Encoding HD Maps and Agent Dynamics From Vectorized Representation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Mohamed Chaabane,et al.  Looking Ahead: Anticipating Pedestrians Crossing with Future Frames Prediction , 2020, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[11]  Marco Pavone,et al.  Trajectron++: Multi-Agent Generative Trajectory Forecasting With Heterogeneous Data for Control , 2020, ArXiv.

[12]  Brian C. Becker,et al.  MultiXNet: Multiclass Multistage Multimodal Motion Prediction , 2020, 2021 IEEE Intelligent Vehicles Symposium (IV).

[13]  Henggang Cui,et al.  Predicting Motion of Vulnerable Road Users using High-Definition Maps and Efficient ConvNets , 2019, 2020 IEEE Intelligent Vehicles Symposium (IV).

[14]  Abduallah A. Mohamed,et al.  Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Marco Cristani,et al.  Transformer Networks for Trajectory Forecasting , 2020, 2020 25th International Conference on Pattern Recognition (ICPR).

[16]  Qiang Xu,et al.  nuScenes: A Multimodal Dataset for Autonomous Driving , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  J. Ross Beveridge,et al.  A Pose Proposal and Refinement Network for Better 6D Object Pose Estimation , 2020, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).

[18]  Ying Nian Wu,et al.  Multi-Agent Tensor Fusion for Contextual Trajectory Prediction , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Alexander Carballo,et al.  A Survey of Autonomous Driving: Common Practices and Emerging Technologies , 2019, IEEE Access.

[20]  Sergio Casas,et al.  Perceive, Predict, and Plan: Safe Motion Planning Through Interpretable Semantic Representations , 2020, ECCV.

[21]  R. Urtasun,et al.  PnPNet: End-to-End Perception and Prediction With Tracking in the Loop , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Henggang Cui,et al.  Uncertainty-aware Short-term Motion Prediction of Traffic Actors for Autonomous Driving , 2018, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[23]  Silvio Savarese,et al.  Social-BiGAT: Multimodal Trajectory Forecasting using Bicycle-GAN and Graph Attention Networks , 2019, NeurIPS.

[24]  J. Ross Beveridge,et al.  DEFT: Detection Embeddings for Tracking , 2021, ArXiv.

[25]  S. State,et al.  A METHODOLOGY FOR AUTOMATED TRAJECTORY PREDICTION ANALYSIS , 2004 .

[26]  Silvio Savarese,et al.  Social LSTM: Human Trajectory Prediction in Crowded Spaces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Silvio Savarese,et al.  Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Mohan M. Trivedi,et al.  Trajectory analysis and prediction for improved pedestrian safety: Integrated framework and evaluations , 2015, 2015 IEEE Intelligent Vehicles Symposium (IV).

[29]  T. Başar,et al.  A New Approach to Linear Filtering and Prediction Problems , 2001 .

[30]  Karl-Heinz Hoffmann,et al.  Prediction of driver intended path at intersections , 2014, 2014 IEEE Intelligent Vehicles Symposium Proceedings.

[31]  Jie Li,et al.  Probabilistic 3D Multi-Object Tracking for Autonomous Driving , 2020, ArXiv.

[32]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Amir Rasouli,et al.  Deep Learning for Vision-based Prediction: A Survey , 2020, ArXiv.

[34]  Sergey Levine,et al.  PRECOG: PREdiction Conditioned on Goals in Visual Multi-Agent Settings , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[35]  Mohan M. Trivedi,et al.  Trajectory Prediction for Autonomous Driving based on Multi-Head Attention with Joint Agent-Map Representation , 2020 .

[36]  Luc Van Gool,et al.  Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Sammy Omari,et al.  One Thousand and One Hours: Self-driving Motion Prediction Dataset , 2020, CoRL.

[38]  Sergio Casas,et al.  IntentNet: Learning to Predict Intention from Raw Sensor Data , 2018, CoRL.

[39]  Bin Yang,et al.  Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[40]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.