Paper information - Improving Vehicle Re-Identification using CNN Latent Spaces: Metrics Comparison and Track-to-track Extension

Improving Vehicle Re-Identification using CNN Latent Spaces: Metrics Comparison and Track-to-track Extension

This paper addresses vehicle re-identification by comparing distances between images in CNN latent spaces. First, we study the impact of the distance metric by comparing the performance obtained with three metrics: the minimal Euclidean distance (MED), the minimal cosine distance (MCD), and the residue of the sparse coding reconstruction (RSCR). These metrics are applied to features extracted with five CNN architectures: ResNet18, AlexNet, VGG16, InceptionV3 and DenseNet201. We use the vehicle re-identification dataset VeRI to fine-tune these CNNs and to evaluate the results. Overall, regardless of the CNN used, MCD outperforms MED, which is the metric commonly used in the literature. Second, the state-of-the-art image-to-track process (I2TP) is extended to a track-to-track process (T2TP) without using complementary metadata. The metrics are extended to measure the distance between tracks, which enables the evaluation of T2TP and its comparison with I2TP using the same CNN models. Results show that T2TP outperforms I2TP for MCD and RSCR. T2TP combining DenseNet201 with MCD-based metrics achieves the best performance, outperforming the state-of-the-art I2TP models that use complementary metadata. Finally, our experiments highlight two main results: i) the choice of metric matters for vehicle re-identification, and ii) T2TP improves performance compared to I2TP, especially when coupled with MCD-based metrics.
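To make the two simpler metrics concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that a track is a list of feature vectors, that MED and MCD take the minimum Euclidean or cosine distance between a query feature and the features of a track, and that feature vectors are non-zero. The function names and the toy vectors are hypothetical. The track-to-track function uses the minimum over all feature pairs as one plausible aggregation; the abstract does not specify how the paper aggregates distances between tracks. RSCR is omitted because it requires a sparse coding solver.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity); assumes non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def image_to_track(query, track, dist):
    """Image-to-track distance: smallest distance from the query to any track feature."""
    return min(dist(query, feat) for feat in track)


def track_to_track(track_a, track_b, dist):
    """Track-to-track distance, assumed here to be the minimum over all feature pairs."""
    return min(image_to_track(feat, track_b, dist) for feat in track_a)


# Toy 2-D features standing in for CNN latent vectors (hypothetical values).
query = [1.0, 0.0]
track = [[0.0, 1.0], [2.0, 0.0]]

med = image_to_track(query, track, euclidean)        # closest feature is [2.0, 0.0]
mcd = image_to_track(query, track, cosine_distance)  # [2.0, 0.0] has the same direction as the query
t2t = track_to_track([[1.0, 0.0], [0.0, 3.0]], track, euclidean)
```

In this toy case the cosine distance ignores vector magnitude, so the query and `[2.0, 0.0]` are at distance zero under MCD while they remain one unit apart under MED. This illustrates how the two metrics can rank candidates differently.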
