Unsupervised Person Re-identification: Clustering and Fine-tuning

The superiority of deeply learned pedestrian representations has been reported in very recent literature of person re-identification (re-ID). In this paper, we consider the more pragmatic issue of learning a deep feature with no or only a few labels. We propose a progressive unsupervised learning (PUL) method to transfer pretrained deep representations to unseen domains. Our method is easy to implement and can be viewed as an effective baseline for unsupervised re-ID feature learning. Specifically, PUL iterates between 1) pedestrian clustering and 2) fine-tuning of the convolutional neural network (CNN) to improve the original model trained on the irrelevant labeled dataset. Since the clustering results can be very noisy, we add a selection operation between the clustering and fine-tuning. At the beginning when the model is weak, CNN is fine-tuned on a small amount of reliable examples which locate near to cluster centroids in the feature space. As the model becomes stronger in subsequent iterations, more images are being adaptively selected as CNN training samples. Progressively, pedestrian clustering and the CNN model are improved simultaneously until algorithm convergence. This process is naturally formulated as self-paced learning. We then point out promising directions that may lead to further improvement. Extensive experiments on three large-scale re-ID datasets demonstrate that PUL outputs discriminative features that improve the re-ID accuracy.

[1]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Hai Tao,et al.  Evaluating Appearance Models for Recognition, Reacquisition, and Tracking , 2007 .

[3]  Tao Xiang,et al.  Deep Transfer Learning for Person Re-Identification , 2016, 2018 IEEE Fourth International Conference on Multimedia Big Data (BigMM).

[4]  Shengcai Liao,et al.  Person re-identification by Local Maximal Occurrence representation and metric learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Xiaogang Wang,et al.  End-to-End Deep Learning for Person Search , 2016, ArXiv.

[6]  Shiguang Shan,et al.  Self-Paced Learning with Diversity , 2014, NIPS.

[7]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Shaogang Gong,et al.  Learning a Discriminative Null Space for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Sergei Vassilvitskii,et al.  k-means++: the advantages of careful seeding , 2007, SODA '07.

[10]  Yifan Sun,et al.  SVDNet for Pedestrian Retrieval , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[11]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Hai Tao,et al.  Viewpoint Invariant Pedestrian Recognition with an Ensemble of Localized Features , 2008, ECCV.

[13]  Shuicheng Yan,et al.  End-to-End Comparative Attention Networks for Person Re-Identification , 2016, IEEE Transactions on Image Processing.

[14]  Deyu Meng,et al.  Co-Saliency Detection via a Self-Paced Multiple-Instance Learning Framework , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Xiaogang Wang,et al.  DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Xiaogang Wang,et al.  Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Alessandro Perina,et al.  Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[18]  Huchuan Lu,et al.  Pose-Invariant Embedding for Deep Person Re-Identification , 2017, IEEE Transactions on Image Processing.