Whose Track Is It Anyway? Improving Robustness to Tracking Errors with Affinity-based Trajectory Prediction

Multi-agent trajectory prediction is critical for planning and decision-making in human-interactive autonomous systems, such as self-driving cars. However, most prediction models are developed separately from their upstream perception (detection and tracking) modules and assume ground-truth past trajectories as inputs. As a result, their performance degrades significantly when real-world, noisy tracking results are used as inputs instead. This degradation is typically caused by the propagation of errors from tracking to prediction, such as noisy tracks, fragments, and identity switches. To alleviate this propagation of errors, we propose a new prediction paradigm that uses detections and their affinity matrices across frames as inputs, removing the need for error-prone data association during tracking. Since affinity matrices contain “soft” information about the similarity and identity of detections across frames, predicting directly from affinity matrices retains strictly more information than predicting from the tracklets generated by data association. Experiments on large-scale, real-world autonomous driving datasets show that our affinity-based prediction scheme reduces overall prediction errors by up to 57.9% compared with standard prediction pipelines that use tracklets as inputs, with even larger error reductions (up to 88.6%) when the evaluation is restricted to challenging scenarios with tracking errors. Our project website is at https://www.xinshuoweng.com/projects/Affinipred.
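
To make the contrast between "soft" affinities and hard data association concrete, here is a minimal sketch. It is not the paper's implementation. The distance-based softmax affinity, the `temperature` parameter, the brute-force matcher, and the names `affinity_matrix` and `hard_association` are all illustrative assumptions.

```python
import math
from itertools import permutations


def affinity_matrix(prev_dets, curr_dets, temperature=1.0):
    """Soft affinities between detections in two consecutive frames.

    Assumption: affinity is a row-wise softmax over negative Euclidean
    distances. A learned affinity could be used in its place.
    """
    rows = []
    for p in prev_dets:
        logits = [-math.dist(p, c) / temperature for c in curr_dets]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


def hard_association(affinity):
    """One-to-one matching that maximizes total affinity.

    Brute force over permutations, so it assumes a small square matrix.
    The result keeps a single identity per detection and discards how
    close the competing candidates were.
    """
    n = len(affinity)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(affinity[i][perm[i]] for i in range(n)),
    )
    return list(best)


# Two objects that pass close to each other: identities are ambiguous.
prev_dets = [(0.0, 0.0), (1.0, 0.0)]
curr_dets = [(0.6, 0.1), (0.5, -0.1)]

A = affinity_matrix(prev_dets, curr_dets)  # soft, near-uniform rows
match = hard_association(A)                # hard, one index per row

for row in A:
    print([round(v, 3) for v in row])
print(match)
```

In this toy case each row of `A` is close to uniform, which signals that either identity assignment is plausible. The hard matching commits to one of them, so a downstream predictor that only sees the resulting tracklets cannot tell a confident match from a near coin flip.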
