Person image synthesis through siamese generative adversarial network

Abstract Photo-realistic image synthesis is an attractive idea for person re-identification (ReID) and for data augmentation in human pose estimation. However, existing approaches to human image synthesis lose texture details when poses or appearances vary. This paper presents a person image synthesis Siamese generative adversarial network (PS2GAN), which re-synthesizes a person image by changing the pose of that person to a given target pose. The model is built as a Siamese structure in which each single branch contains an image generative network and pair-conditional discriminative networks. For pose transfer, the proposed PS2GAN adopts a Siamese structure consisting of two image generative networks, together with a novel contrastive-pose loss that regularizes the generation process. Additionally, a nearest-neighbor loss computes the difference between fake and real images to bring their high-level information closer. Qualitative and quantitative comparisons with prior work show that the proposed PS2GAN is competitive with the state of the art on the Market-1501 and DeepFashion datasets, and that synthetic images from PS2GAN can alleviate data insufficiency for person ReID.
