Graph-Induced Contrastive Learning for Intra-Camera Supervised Person Re-Identification

Intra-camera supervision (ICS) for person re-identification (Re-ID) assumes that identity labels are annotated independently within each camera view and that no inter-camera identity association is labeled. This recently proposed setting aims to reduce the annotation burden while maintaining desirable Re-ID performance. However, the lack of inter-camera labels makes the ICS Re-ID problem much more challenging than its fully supervised counterpart. By investigating the characteristics of ICS, this article proposes a graph-induced contrastive learning (GCL) approach to address this issue. More specifically, we first formulate the cross-camera ID association task as a graph partitioning problem subject to ICS-specific constraints and design a greedy agglomeration algorithm to solve it. Then, we propose a graph-induced contrastive loss that unifies intra- and inter-camera learning in a single contrastive learning framework to learn a Re-ID model. The cross-camera ID association step and the contrastive learning step are iterated alternately, progressively yielding a highly discriminative Re-ID model. Extensive experiments on three large-scale datasets show that our approach outperforms all previous ICS works. In particular, it improves Rank-1 by 15.7% and mAP by 14.3% on the challenging MSMT17 dataset. Moreover, our approach even performs comparably to state-of-the-art fully supervised methods on all three datasets.
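
The cross-camera association step described above can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not spell out the ICS-specific constraints, so the code assumes one plausible constraint, namely that a group may contain at most one identity per camera, since identities labeled within the same camera are known to be distinct. The function name `greedy_associate`, the use of cosine similarity between per-camera identity prototypes, and the `threshold` parameter are all hypothetical choices made for illustration.

```python
import numpy as np

def greedy_associate(protos, cams, threshold=0.5):
    """Greedily merge per-camera identities into cross-camera groups.

    protos: (N, D) array, one feature prototype per intra-camera identity.
    cams:   length-N sequence of camera ids, one per prototype.
    Returns a list of N group ids (identities sharing an id were merged).
    Assumed constraint: a group never holds two identities from one camera.
    """
    protos = np.asarray(protos, dtype=float)
    protos = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    sim = protos @ protos.T  # cosine similarity between prototypes
    n = len(cams)

    group = list(range(n))                      # current group id of each node
    members = {i: [i] for i in range(n)}        # nodes in each group
    group_cams = {i: {cams[i]} for i in range(n)}  # cameras covered by each group

    # Candidate edges: cross-camera pairs whose similarity passes the threshold.
    edges = [(sim[i, j], i, j)
             for i in range(n) for j in range(i + 1, n)
             if cams[i] != cams[j] and sim[i, j] >= threshold]
    edges.sort(key=lambda e: -e[0])  # most similar pairs first

    for _, i, j in edges:
        gi, gj = group[i], group[j]
        if gi == gj:
            continue  # already in the same group
        if group_cams[gi] & group_cams[gj]:
            continue  # merging would join two identities from one camera
        for k in members[gj]:
            group[k] = gi
        members[gi].extend(members.pop(gj))
        group_cams[gi] |= group_cams.pop(gj)
    return group
```

In an alternating scheme like the one the abstract describes, the resulting group ids would serve as pseudo cross-camera labels for the next round of contrastive training, after which the prototypes are recomputed and the association is run again.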
