Similarity-preserving Image-image Domain Adaptation for Person Re-identification

This article studies domain adaptation in person re-identification (re-ID) under a "learning via translation" framework, which consists of two components: 1) translating the labeled images from the source domain to the target domain in an unsupervised manner, and 2) learning a re-ID model using the translated images. The objective is to preserve the underlying human identity information after image translation, so that the translated images, together with their labels, are effective for feature learning on the target domain. To this end, we propose a similarity preserving generative adversarial network (SPGAN) and its end-to-end trainable version, eSPGAN. Both aim at similarity preservation: SPGAN enforces this property through heuristic constraints, while eSPGAN does so by optimally facilitating re-ID model learning. More specifically, SPGAN undertakes the two components of the "learning via translation" framework separately. It first preserves two types of unsupervised similarity, namely, the self-similarity of an image before and after translation, and the domain-dissimilarity between a translated source image and a target image. It then learns a re-ID model using existing networks. In comparison, eSPGAN seamlessly integrates image translation and re-ID model learning. During the end-to-end training of eSPGAN, re-ID learning guides image translation to preserve the underlying identity information of an image. Meanwhile, image translation improves re-ID learning by providing identity-preserving training samples in the target domain style. In experiments, we show that the identities of the fake images generated by SPGAN and eSPGAN are well preserved. Based on this, we report new state-of-the-art domain adaptation results on two large-scale person re-ID datasets.
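As a rough illustration of the two similarity constraints described above, the sketch below combines a self-similarity term (an image and its translation should stay close in an embedding space) with a domain-dissimilarity term (a translated source image and a target image should stay apart). The abstract does not state the form of the loss, so the use of a standard contrastive loss, the margin value, and the function names `contrastive_loss` and `similarity_preserving_loss` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def contrastive_loss(a, b, is_positive, margin=2.0):
    """Standard contrastive loss on L2-normalised embeddings (assumed form).

    A positive pair is pulled together; a negative pair is pushed apart
    until its distance reaches `margin`. Inputs must be non-zero vectors.
    """
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    d = np.linalg.norm(a - b)
    if is_positive:
        return d ** 2
    return max(margin - d, 0.0) ** 2


def similarity_preserving_loss(emb_src, emb_translated, emb_tgt, margin=2.0):
    """Hypothetical combination of the two unsupervised constraints.

    emb_src:        embedding of a source image
    emb_translated: embedding of the same image after translation
    emb_tgt:        embedding of any target-domain image
    """
    # Self-similarity: the translated image should keep the source identity.
    self_sim = contrastive_loss(emb_src, emb_translated, True, margin)
    # Domain-dissimilarity: the translated image depicts a different person
    # than any target image, so the pair is treated as negative.
    domain_dissim = contrastive_loss(emb_translated, emb_tgt, False, margin)
    return self_sim + domain_dissim


if __name__ == "__main__":
    src = np.array([1.0, 0.0])
    good_translation = np.array([1.0, 0.0])   # identity preserved
    bad_translation = np.array([0.0, 1.0])    # collapsed onto the target
    tgt = np.array([0.0, 1.0])
    print(similarity_preserving_loss(src, good_translation, tgt))
    print(similarity_preserving_loss(src, bad_translation, tgt))
```

In this toy setting the loss is lower when the translated embedding stays near the source embedding and away from the target embedding, which is the behaviour the two constraints are meant to encourage.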
