Improving Multiple Pedestrian Tracking by Track Management and Occlusion Handling

Multi-pedestrian trackers perform well when targets are clearly visible making the association task quite easy. However, when heavy occlusions are present, a mechanism to re-identify persons is needed. The common approach is to extract visual features from new detections and compare them with the features of previously found tracks. Since those detections can have substantial overlaps with nearby targets – especially in crowded scenarios – the extracted features are insufficient for a reliable re-identification. In contrast, we propose a novel occlusion handling strategy that explicitly models the relation between occluding and occluded tracks outperforming the feature-based approach, while not depending on a separate re-identification network. Furthermore, we improve the track management of a regression-based method in order to bypass missing detections and to deal with tracks leaving the scene at the border of the image. Finally, we apply our tracker in both temporal directions and merge tracklets belonging to the same target, which further enhances the performance. We demonstrate the effectiveness of our tracking components with ablative experiments and surpass the state-of-the-art methods on the three popular pedestrian tracking benchmarks MOT16, MOT17, and MOT20.

[1]  Laura Leal-Taix'e,et al.  Learning a Neural Solver for Multiple Object Tracking , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Daniel P. Huttenlocher,et al.  Efficient Belief Propagation for Early Vision , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[3]  Wei Wu,et al.  Multi-Object Tracking with Multiple Cues and Switcher-Aware Classification , 2019, ArXiv.

[4]  Rainer Stiefelhagen,et al.  Multiple Object Tracking Performance Metrics and Evaluation in a Smart Room Environment , 2006 .

[5]  Jun Zhao,et al.  MAT: Motion-Aware Multi-Object Tracking , 2020, Neurocomputing.

[6]  Daniel Cremers,et al.  MOT20: A benchmark for multi object tracking in crowded scenes , 2020, ArXiv.

[7]  Stefan Roth,et al.  MOT16: A Benchmark for Multi-Object Tracking , 2016, ArXiv.

[8]  Jiri Matas,et al.  Forward-Backward Error: Automatic Detection of Tracking Failures , 2010, 2010 20th International Conference on Pattern Recognition.

[9]  Jian Wang,et al.  TPM: Multiple object tracking with tracklet-plane matching , 2020, Pattern Recognit..

[10]  Kwangjin Yoon,et al.  Online Multi-Object Tracking With GMPHD Filter and Occlusion Group Management , 2019, IEEE Access.

[11]  Cewu Lu,et al.  TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Xingyi Zhou,et al.  Objects as Points , 2019, ArXiv.

[13]  Vladlen Koltun,et al.  Tracking Objects as Points , 2020, ECCV.

[14]  Fabio Tozeto Ramos,et al.  Simple online and realtime tracking , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[15]  Dietrich Paulus,et al.  Simple online and realtime tracking with a deep association metric , 2017, 2017 IEEE International Conference on Image Processing (ICIP).

[16]  Shengjin Wang,et al.  Towards Real-Time Multi-Object Tracking , 2019, ECCV.

[17]  Francesco Solera,et al.  Performance Measures and a Data Set for Multi-target, Multi-camera Tracking , 2016, ECCV Workshops.

[18]  Kai Chen,et al.  MMDetection: Open MMLab Detection Toolbox and Benchmark , 2019, ArXiv.

[19]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  K. Madhava Krishna,et al.  Beyond Pixels: Leveraging Geometry and Shape Cues for Online Multi-Object Tracking , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[21]  Zhang Xiong,et al.  Long-Term Tracking With Deep Tracklet Association , 2020, IEEE Transactions on Image Processing.

[22]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Georgios D. Evangelidis,et al.  Parametric Image Alignment Using Enhanced Correlation Coefficient Maximization , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Laura Leal-Taixé,et al.  Tracking Without Bells and Whistles , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[25]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Xu Gao,et al.  OSMO: Online Specific Models for Occlusion in Multiple Object Tracking under Surveillance Scene , 2018, ACM Multimedia.

[27]  Zhang Xiong,et al.  Multiplex Labeling Graph for Near-Online Tracking in Crowded Scenes , 2020, IEEE Internet of Things Journal.

[28]  Andrew Zisserman,et al.  Detect to Track and Track to Detect , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[29]  Guna Seetharaman,et al.  Multi-object Tracking Cascade with Multi-Step Data Association and Occlusion Handling , 2018, 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[30]  Rui Caseiro,et al.  High-Speed Tracking with Kernelized Correlation Filters , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Bin Liu,et al.  GSM: Graph Similarity Model for Multi-Object Tracking , 2020, IJCAI.

[32]  Yong Yan,et al.  Multi-Target Tracking with Trajectory Prediction and Re-Identification , 2019, 2019 Chinese Automation Congress (CAC).

[33]  Bodo Rosenhahn,et al.  Lifted Disjoint Paths with Application in Multiple Object Tracking , 2020, ICML.

[34]  Tobias Senst,et al.  Extending IOU Based Multi-Object Tracking by Visual Information , 2018, 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[35]  Fan Yang,et al.  Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Yue Cao,et al.  Spatial-Temporal Relation Networks for Multi-Object Tracking , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[37]  Jenq-Neng Hwang,et al.  Exploit the Connectivity: Multi-Object Tracking with TrackletNet , 2018, ACM Multimedia.

[38]  Bernt Schiele,et al.  Multiple People Tracking by Lifted Multicut and Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Xavier Alameda-Pineda,et al.  How to Train Your Deep Multi-Object Tracker , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.