Online Multiple Object Tracking with Reid Feature Extraction Network and Similarity Matrix Network

In multiple object tracking (MOT), data association is a crucial step. Given a similarity loss matrix between trajectories and detections, the two sets can be matched with the Hungarian algorithm. However, the similarity loss is often computed as the Euclidean distance, or another handcrafted distance metric, between features extracted from the objects, which may not be robust enough and can lead to inaccurate matching. In this paper, we propose a novel MOT method that applies deep learning to both feature extraction and data association. We first design an appearance feature extraction network (AFN) that learns effective features by training on a large-scale person re-identification (ReID) dataset. We then propose a similarity matrix estimation network (SMN) that produces reliable similarities by training on the public MOT Challenge dataset MOT17. Additionally, the similarity matrix output by the SMN includes dummy objects, which handle the association cases where an object disappears or a new object appears between frames. Finally, the proposed method is evaluated on MOT15 and MOT17, and an ablation study is carried out.
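To make the association step concrete, the following is a minimal sketch of the handcrafted baseline the abstract describes, not the proposed SMN. It builds a Euclidean-distance loss matrix between trajectory and detection features, pads it with dummy rows and columns so that a trajectory can go unmatched (object missing) and a detection can start a new trajectory (object appearance), and then finds the minimum-cost assignment. The function names, the toy 2-D features, and the fixed `dummy_cost` value are assumptions for illustration. For brevity the assignment is solved by brute force over permutations rather than by the Hungarian algorithm; both return a minimum-cost assignment, but brute force is only practical for very small matrices.

```python
from itertools import permutations
import math


def euclidean(a, b):
    """Handcrafted distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_cost_matrix(track_feats, det_feats, dummy_cost):
    """Square loss matrix: rows = trajectories + dummy trajectories,
    columns = detections + dummy detections."""
    num_t, num_d = len(track_feats), len(det_feats)
    n = num_t + num_d
    cost = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i < num_t and j < num_d:
                # Real trajectory vs. real detection.
                cost[i][j] = euclidean(track_feats[i], det_feats[j])
            elif i < num_t or j < num_d:
                # Exactly one side is a dummy: pay the (assumed) dummy cost.
                cost[i][j] = dummy_cost
            # Dummy vs. dummy keeps the initial cost of 0.0.
    return cost


def associate(track_feats, det_feats, dummy_cost=0.5):
    """Return (matches, missing_tracks, appeared_detections)."""
    num_t, num_d = len(track_feats), len(det_feats)
    n = num_t + num_d
    cost = build_cost_matrix(track_feats, det_feats, dummy_cost)

    # Brute-force minimum-cost assignment (stand-in for the Hungarian algorithm).
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )

    matches = [(i, best[i]) for i in range(num_t) if best[i] < num_d]
    missing = [i for i in range(num_t) if best[i] >= num_d]
    matched_dets = {j for _, j in matches}
    appeared = [j for j in range(num_d) if j not in matched_dets]
    return matches, missing, appeared


# Toy example: trajectory 0 has a close detection, trajectory 1 does not.
tracks = [(0.0, 0.0), (5.0, 5.0)]
detections = [(0.1, 0.0), (9.0, 9.0)]
matches, missing, appeared = associate(tracks, detections, dummy_cost=0.5)
print(matches, missing, appeared)  # [(0, 0)] [1] [1]
```

In this sketch, leaving a trajectory and a detection both unmatched costs `2 * dummy_cost` in total, so that value acts as a gate on how dissimilar a matched pair may be. In the method described above, the similarities, including the dummy entries, would instead come from the learned SMN.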
