Unsupervised Pre-training for Person Re-identification

In this paper, we present a large-scale unlabeled person re-identification (Re-ID) dataset, "LUPerson", and make the first attempt at unsupervised pre-training to improve the generalization ability of learned person Re-ID feature representations. This addresses the problem that existing person Re-ID datasets are all of limited scale because data annotation is costly. Previous research tries to leverage models pre-trained on ImageNet to mitigate the shortage of person Re-ID data, but suffers from the large domain gap between ImageNet and person Re-ID data. LUPerson is an unlabeled dataset of 4M images of over 200K identities, which is 30× larger than the largest existing Re-ID dataset. It also covers a much more diverse range of capturing environments (e.g., camera settings, scenes, etc.). Based on this dataset, we systematically study the key factors for learning Re-ID features from two perspectives: data augmentation and contrastive loss. Unsupervised pre-training performed on this large-scale dataset effectively leads to a generic Re-ID feature that can benefit all existing person Re-ID methods. Using our pre-trained model in some basic frameworks, our methods achieve state-of-the-art results without bells and whistles on four widely used Re-ID datasets: CUHK03, Market1501, DukeMTMC, and MSMT17. Our results also show that the performance improvement is more significant on small-scale target datasets or under a few-shot setting.
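The abstract names contrastive loss as one of the two factors studied but does not state its form. As a rough illustration only, the sketch below computes the InfoNCE loss commonly used in contrastive pre-training, for a single query feature against one positive and several negatives. The function names, the pure-Python vector representation, and the temperature value of 0.1 are assumptions made for this example and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length (assumes v is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(query, positive, negatives, temperature=0.1):
    """InfoNCE loss for one query: -log softmax probability of the positive.

    `query` and `positive` stand for features of two augmented views of the
    same image; `negatives` are features of other images. The temperature
    default is an illustrative assumption, not a value from the paper.
    """
    q = l2_normalize(query)
    # Cosine similarities scaled by temperature; the positive is at index 0.
    logits = [dot(q, l2_normalize(positive)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all logits.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_denominator - logits[0]


if __name__ == "__main__":
    aligned = info_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
    misaligned = info_nce_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
    print(aligned, misaligned)
```

The loss is small when the query is closer to its positive than to any negative and grows when a negative is the closer match, which is the behavior a contrastive pre-training objective relies on.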
