Multi-Camera Tracking By Candidates Intersection Ratio Tracklet Matching

Multi-camera vehicle tracking at the city scale is an essential task in traffic management for smart cities. Large-scale video analytics is challenging due to vehicle variability, viewpoint variations, frequent occlusions, degraded pixel quality, and appearance differences. In this work, we develop a multi-target multi-camera (MTMC) vehicle tracking system based on a newly proposed Candidates Intersection Ratio (CIR) metric that can effectively evaluate vehicle tracklets for matching across views. Our system consists of four modules: (1) Faster R-CNN vehicle detection, (2) detection association based on re-identification feature matching, (3) single-camera tracking (SCT) to produce initial tracklets, and (4) multi-camera vehicle tracklet matching and re-identification that creates longer, consistent tracklets across the city scale. Building on popular DNN object detection and SCT modules, we focus on the development of tracklet creation, association, and linking in SCT and MTMC. Specifically, SCT filters are proposed to effectively eliminate unreliable tracklets, and the CIR metric improves the robustness of vehicle tracklet linking across visually distinct views. Our system obtains an IDF1 score of 0.1343 on the AI City 2021 Challenge Track 3 public leaderboard.
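For readers unfamiliar with the reported metric, IDF1 is commonly defined in the multi-target tracking literature as the harmonic mean of identification precision and recall, computed from identity-level true positives (IDTP), false positives (IDFP), and false negatives (IDFN). The following is a minimal sketch of that formula only; the function name and the example counts are hypothetical, and it is not the evaluation code used by the challenge.

```python
def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Return IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN).

    The counts are assumed to come from an identity-level matching
    between predicted and ground-truth trajectories, which this
    sketch does not perform.
    """
    denominator = 2 * idtp + idfp + idfn
    if denominator == 0:
        # No detections and no ground truth: define the score as 0.0
        # (a convention chosen for this sketch).
        return 0.0
    return 2 * idtp / denominator


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    print(round(idf1(idtp=50, idfp=25, idfn=25), 4))
```

Because identity errors are counted over whole trajectories, a tracker can detect vehicles well and still score low on IDF1 if identities are frequently swapped or fragmented across cameras.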
