SwapGAN: A Multistage Generative Approach for Person-to-Person Fashion Style Transfer

Fashion style transfer has attracted significant attention because it poses interesting scientific challenges and is important to the fashion industry. This paper addresses a practical problem in fashion style transfer, person-to-person clothing swapping, which aims to visualize how a person would look wearing the target clothes worn by another person, without dressing them physically. The problem remains challenging because pose deformations vary between person images. In contrast to traditional nonparametric methods that blend or warp the target clothes onto the reference person, we propose a multistage deep generative approach named SwapGAN that combines three generators and one discriminator in a unified framework to perform the task end-to-end. The first and second generators are conditioned on a human pose map and a segmentation map, respectively, so that the pose style and the clothes style are transferred simultaneously. The third generator preserves the human body shape during image synthesis. The discriminator must distinguish two fake image pairs from the real image pair. The entire SwapGAN is trained with a combination of the adversarial loss and the mask-consistency loss. Experimental results on the DeepFashion dataset demonstrate that SwapGAN improves over existing approaches in both quantitative and qualitative evaluations. Moreover, we conduct ablation studies on SwapGAN and provide a detailed analysis of its effectiveness.
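
As a rough illustration of how an adversarial loss over one real and two fake image pairs might be combined with a mask-consistency loss, the following NumPy sketch may help. It is not the paper's implementation: the least-squares form of the adversarial term, the L1 form of the mask term, the equal weighting of the two fake pairs, and the weight `lam` are all assumptions made here for illustration, and every function name is hypothetical.

```python
import numpy as np


def discriminator_loss(score_real, score_fake_1, score_fake_2):
    """Assumed least-squares loss: push the real pair to 1 and both fake pairs to 0."""
    real_term = np.mean((score_real - 1.0) ** 2)
    # Assumption: the two fake pairs are weighted equally.
    fake_term = 0.5 * (np.mean(score_fake_1 ** 2) + np.mean(score_fake_2 ** 2))
    return real_term + fake_term


def generator_adversarial_loss(score_fake_1, score_fake_2):
    """Assumed least-squares loss: the generators want both fake pairs scored as 1."""
    return 0.5 * (
        np.mean((score_fake_1 - 1.0) ** 2) + np.mean((score_fake_2 - 1.0) ** 2)
    )


def mask_consistency_loss(pred_mask, ref_mask):
    """Assumed L1 distance between a predicted body mask and a reference mask."""
    return np.mean(np.abs(pred_mask - ref_mask))


def generator_objective(score_fake_1, score_fake_2, pred_mask, ref_mask, lam=1.0):
    """Adversarial term plus the mask term scaled by a hypothetical weight lam."""
    adv = generator_adversarial_loss(score_fake_1, score_fake_2)
    return adv + lam * mask_consistency_loss(pred_mask, ref_mask)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for discriminator outputs and masks; real values would come from networks.
    real = rng.uniform(0.6, 1.0, size=(4, 1))
    fake_1 = rng.uniform(0.0, 0.4, size=(4, 1))
    fake_2 = rng.uniform(0.0, 0.4, size=(4, 1))
    pred = rng.uniform(0.0, 1.0, size=(4, 8, 8))
    ref = (pred > 0.5).astype(float)
    print("D loss:", discriminator_loss(real, fake_1, fake_2))
    print("G objective:", generator_objective(fake_1, fake_2, pred, ref, lam=10.0))
```

In this sketch the discriminator loss is zero only when the real pair scores 1 and both fake pairs score 0, while the mask term vanishes when the predicted and reference masks agree.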
