Cross-dataset person re-identification using deep convolutional neural networks: effects of context and domain adaptation

Over the past years, the impact of surveillance systems on public safety increases dramatically. One significant challenge in this domain is person re-identification, which aims to detect whether a person has already been captured by another camera in the surveillance network or not. Most of the work that has been conducted on person re-identification problem uses a single dataset, in which the training and test data are coming from the same source. However, as we have shown in this work, there is a strong bias among the person re-identification datasets, therefore, a method that has been trained and optimized on a specific person re-identification dataset may not generalize well and perform successfully on the other datasets. This is a problem for many real-world applications, since it is not feasible to collect and annotate sufficient amount of data from the target application to train or fine-tune a deep convolutional neural network model. Taking this issue into account, in this work, we have focused on cross-dataset person re-identification problem and first explored and analyzed in detail the use of the state-of-the-art deep convolutional neural network architectures, namely AlexNet, VGGNet, GoogLeNet, ResNet, and DenseNet that have been developed for generic image classification task. These deep CNN models have been adapted to the person re-identification domain by fine-tuning them for each human body part separately, as well as on the entire body, with the two relatively large person re-identification datasets: CUHK03 and Market-1501. Then, the performance of each adapted model has been evaluated on two different publicly available datasets: VIPeR and PRID2011. We have shown that, even just a domain adaptation leads comparable results to the state-of-the-art cross-dataset approaches. Another point that we have addressed in this paper is context adaptation. It has been known that person re-identification approaches implicitly utilizes background as context information. Therefore, to have a consistent background across different camera views, we have employed the cycle-consistent generative adversarial network. We have shown that this further improves the performance.

[1]  Hai Tao,et al.  Viewpoint Invariant Pedestrian Recognition with an Ensemble of Localized Features , 2008, ECCV.

[2]  Jiwen Lu,et al.  Deep transfer metric learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[4]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Michael Jones,et al.  An improved deep learning architecture for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Pong C. Yuen,et al.  Domain Transfer Support Vector Ranking for Person Re-identification without Target Camera Label Information , 2013, 2013 IEEE International Conference on Computer Vision.

[7]  Alexei A. Efros,et al.  Unbiased look at dataset bias , 2011, CVPR 2011.

[8]  Yi Yang,et al.  Unsupervised Person Re-identification , 2018, ACM Trans. Multim. Comput. Commun. Appl..

[9]  Xiaogang Wang,et al.  Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Qiong Wu Multi-Scale Convolutional Network for Person Re-identification , 2017 .

[11]  Yi Yang,et al.  Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in Vitro , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[12]  Ping Li,et al.  Cross-Domain Person Reidentification Using Domain Adaptation Ranking SVMs , 2015, IEEE Transactions on Image Processing.

[13]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[15]  Horst Bischof,et al.  Person Re-identification by Descriptive and Discriminative Classification , 2011, SCIA.

[16]  Shaogang Gong,et al.  Unsupervised Cross-Dataset Transfer Learning for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[19]  Niall McLaughlin,et al.  Data-augmentation for reducing dataset bias in person re-identification , 2015, 2015 12th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[20]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Weizhi Xu,et al.  Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline , 2019, Proceedings of the 2019 5th International Conference on Computer and Technology Applications.

[22]  Shengcai Liao,et al.  Efficient PSD Constrained Asymmetric Metric Learning for Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[23]  Barbara Caputo,et al.  Looking beyond appearances: Synthetic training data for deep CNNs in re-identification , 2017, Comput. Vis. Image Underst..

[24]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[25]  Xiaogang Wang,et al.  DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Bolei Zhou,et al.  Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Yifan Sun,et al.  SVDNet for Pedestrian Retrieval , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[29]  Qi Tian,et al.  Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[30]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[31]  Yoshua Bengio,et al.  How transferable are features in deep neural networks? , 2014, NIPS.

[32]  Sambit Bakshi,et al.  Illumination and scale invariant relevant visual features with hypergraph-based learning for multi-shot person re-identification , 2017, Multimedia Tools and Applications.

[33]  Horst Bischof,et al.  Large scale metric learning from equivalence constraints , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Yi Yang,et al.  Person Re-identification: Past, Present and Future , 2016, ArXiv.

[35]  Shengcai Liao,et al.  Deep Metric Learning for Person Re-identification , 2014, 2014 22nd International Conference on Pattern Recognition.

[36]  Yang Hu,et al.  Cross Dataset Person Re-identification , 2014, ACCV Workshops.

[37]  Lucas Beyer,et al.  In Defense of the Triplet Loss for Person Re-Identification , 2017, ArXiv.

[38]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[39]  Sambit Bakshi,et al.  A Neuromorphic Person Re-Identification Framework for Video Surveillance , 2017, IEEE Access.