Weakly Supervised Discriminative Feature Learning With State Information for Person Identification

Unsupervised learning of identity-discriminative visual features is appealing in real-world tasks where manual labelling is costly. However, images of the same identity can differ substantially in appearance when they are taken under different \emph{states}, e.g. different camera views and poses. This visual discrepancy makes unsupervised discriminative learning difficult. Fortunately, in real-world tasks the states are often known without human annotation: camera view labels are readily available in person re-identification, as are facial pose labels in face recognition. In this work we propose using the state information as weak supervision to address the visual discrepancy caused by different states. We formulate a simple pseudo label model and use the state information to refine the assigned pseudo labels through weakly supervised decision boundary rectification and weakly supervised feature drift regularization. We evaluate our model on unsupervised person re-identification and pose-invariant face recognition. Despite its simplicity, our method outperforms the state-of-the-art results on the Duke-reID, MultiPIE and CFP datasets with a standard ResNet-50 backbone. We also find that our model performs comparably with standard supervised fine-tuning on the three datasets. Code is available at \url{https://github.com/KovenYu/state-information}.
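To make the two ingredients named above more concrete, the following is a minimal illustrative sketch and not the released implementation. It assumes that pseudo labels are assigned by cosine similarity to a set of class prototypes, and it stands in for feature drift regularization with a simple penalty on the distance between each state's mean feature and the global mean feature. The function names `assign_pseudo_labels` and `state_drift_penalty`, the toy features, and the mean-based penalty are all assumptions introduced here for illustration; the abstract does not specify the actual formulation.

```python
import math
from collections import defaultdict


def normalize(v):
    """Return v scaled to unit L2 norm (unchanged if it is the zero vector)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def assign_pseudo_labels(features, prototypes):
    """Hypothetical pseudo-labelling: index of the most cosine-similar prototype."""
    protos = [normalize(p) for p in prototypes]
    labels = []
    for f in features:
        f = normalize(f)
        sims = [dot(f, p) for p in protos]
        labels.append(max(range(len(sims)), key=sims.__getitem__))
    return labels


def state_drift_penalty(features, states):
    """Assumed stand-in for a drift regularizer: average squared distance
    between each state's mean feature and the global mean feature."""
    global_mean = mean(features)
    by_state = defaultdict(list)
    for f, s in zip(features, states):
        by_state[s].append(f)
    penalties = [
        sum((a - b) ** 2 for a, b in zip(mean(group), global_mean))
        for group in by_state.values()
    ]
    return sum(penalties) / len(penalties)


if __name__ == "__main__":
    # Toy 2-D features; the state labels play the role of camera views.
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    protos = [[1.0, 0.0], [0.0, 1.0]]
    print(assign_pseudo_labels(feats, protos))
    print(state_drift_penalty(feats, ["cam_a", "cam_a", "cam_b", "cam_b"]))
```

In this toy setting the penalty is large when each state occupies its own region of feature space and close to zero when the states are spread evenly across it, which is the kind of state-induced shift a drift regularizer would aim to reduce.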
