DeepFusionMOT: A 3D Multi-Object Tracking Framework Based on Camera-LiDAR Fusion With Deep Association

Recent work on 3D multi-object tracking (MOT) tends toward one of two extremes. Many methods pursue tracking accuracy while neglecting computation speed, commonly by designing rather complex cost functions and feature extractors; others focus on computation speed at the expense of tracking accuracy. To address both issues, this paper proposes a robust and fast camera-LiDAR fusion-based MOT method that achieves a good trade-off between accuracy and speed. Relying on the characteristics of camera and LiDAR sensors, an effective deep association mechanism is designed and embedded in the proposed method. When an object is far away and detected only by the camera, this mechanism tracks it in the 2D domain; once the object appears in the LiDAR field of view, the 2D trajectory is updated with the newly obtained 3D information, yielding a smooth fusion of 2D and 3D trajectories. Extensive experiments on typical datasets indicate that the proposed method has clear advantages over state-of-the-art MOT methods in both tracking accuracy and processing speed. Our code is made publicly available for the benefit of the community.
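
The hand-off from camera-only tracking to camera-LiDAR tracking described above can be illustrated with a minimal sketch. This is not the paper's implementation: the `Track` class, the `iou_2d`, `greedy_match`, and `associate_and_update` helpers, the greedy IoU matching, and the 0.3 threshold are all assumptions chosen for illustration, and the actual deep association mechanism may differ substantially. The sketch only shows the idea that a trajectory keeps its identity while its state is promoted from 2D to 3D.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box2D = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Box3D = Tuple[float, ...]                  # hypothetical 3D box parameters
Detection = Tuple[Box2D, Optional[Box3D]]  # 3D part is None for camera-only detections


@dataclass
class Track:
    """Hypothetical trajectory state; box3d stays None while camera-only."""
    track_id: int
    box2d: Box2D
    box3d: Optional[Box3D] = None


def iou_2d(a: Box2D, b: Box2D) -> float:
    """Intersection over union of two axis-aligned image boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks: List[Track], boxes: List[Box2D],
                 thresh: float) -> List[Tuple[int, int]]:
    """Pair tracks and boxes in order of decreasing IoU (assumed matcher)."""
    scored = sorted(
        ((iou_2d(t.box2d, b), ti, bi)
         for ti, t in enumerate(tracks) for bi, b in enumerate(boxes)),
        reverse=True,
    )
    used_tracks, used_boxes, matches = set(), set(), []
    for score, ti, bi in scored:
        if score < thresh:
            break  # remaining pairs overlap too little to be the same object
        if ti in used_tracks or bi in used_boxes:
            continue
        used_tracks.add(ti)
        used_boxes.add(bi)
        matches.append((ti, bi))
    return matches


def associate_and_update(tracks: List[Track], detections: List[Detection],
                         thresh: float = 0.3) -> List[Track]:
    """Match detections to tracks in the image plane; attach 3D when present."""
    boxes = [box2d for box2d, _ in detections]
    for ti, di in greedy_match(tracks, boxes, thresh):
        box2d, box3d = detections[di]
        tracks[ti].box2d = box2d
        if box3d is not None:
            # The object is now seen by LiDAR: the 2D trajectory gains a 3D state.
            tracks[ti].box3d = box3d
    return tracks


# Usage: a distant object is first seen by the camera only, then also by LiDAR.
tracks = [Track(track_id=7, box2d=(0.0, 0.0, 10.0, 10.0))]
associate_and_update(tracks, [((1.0, 1.0, 11.0, 11.0), None)])
associate_and_update(tracks, [((2.0, 2.0, 12.0, 12.0), (5.0, 1.0, 40.0))])
```

After the second call the single track still carries identifier 7 but now holds a 3D state. Creating new trajectories for unmatched detections, deleting stale ones, and motion prediction are deliberately left out of this sketch.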
