SwapNet: Image Based Garment Transfer

We present SwapNet, a framework for transferring garments across images of people with arbitrary body pose, shape, and clothing. Garment transfer is challenging because it requires (i) disentangling the features of the clothing from the body pose and shape and (ii) synthesizing realistic garment texture on the new body. Our neural network architecture tackles these sub-problems with two task-specific sub-networks. Since pairs of images showing the same clothing on different bodies are difficult to acquire, we propose a novel weakly-supervised approach that generates training pairs from a single image via data augmentation. We present the first fully automatic method for garment transfer in unconstrained images that does not require solving the difficult 3D reconstruction problem. We demonstrate a variety of transfer results and highlight our advantages over traditional image-to-image and analogy pipelines.
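
As a rough illustration of how a training pair could be generated from a single image via data augmentation, consider the minimal sketch below. The abstract does not specify which augmentations are used, so the random horizontal flip and small zero-padded shift here are assumptions, and the names `augment`, `make_training_pair`, and `max_shift` are hypothetical.

```python
import numpy as np


def augment(image, rng, max_shift=4):
    """Return a randomly flipped and shifted copy of an H x W x C image.

    The specific transforms (flip, zero-padded shift) are illustrative
    assumptions, not the augmentations used in the paper.
    """
    # Random horizontal flip with probability 0.5.
    out = image[:, ::-1] if rng.random() < 0.5 else image

    # Random integer shift in [-max_shift, max_shift] along each axis.
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    h, w = out.shape[:2]
    shifted = np.zeros_like(out)

    # Source and destination windows for a shift by (dy, dx) with zero padding.
    if dy >= 0:
        ys, yd = slice(0, h - dy), slice(dy, h)
    else:
        ys, yd = slice(-dy, h), slice(0, h + dy)
    if dx >= 0:
        xs, xd = slice(0, w - dx), slice(dx, w)
    else:
        xs, xd = slice(-dx, w), slice(0, w + dx)

    shifted[yd, xd] = out[ys, xs]
    return shifted


def make_training_pair(image, rng):
    """Build an (augmented input, original target) pair from one image."""
    return augment(image, rng), image


rng = np.random.default_rng(0)
image = np.arange(8 * 8 * 3, dtype=np.uint8).reshape(8, 8, 3)
augmented, target = make_training_pair(image, rng)
```

In this sketch the augmented copy and the original show the same clothing by construction, so no second photograph of the same garment is needed to form a pair.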
