Shape-conditioned Image Generation by Learning Latent Appearance Representation from Unpaired Data

Conditional image generation is effective for diverse tasks including training data synthesis for learning-based computer vision. However, despite the recent advances in generative adversarial networks (GANs), it is still a challenging task to generate images with detailed conditioning on object shapes. Existing methods for conditional image generation use category labels and/or keypoints and are only give limited control over object categories. In this work, we present SCGAN, an architecture to generate images with a desired shape specified by an input normal map. The shape-conditioned image generation task is achieved by explicitly modeling the image appearance via a latent appearance vector. The network is trained using unpaired training samples of real images and rendered normal maps. This approach enables us to generate images of arbitrary object categories with the target shape and diverse image appearances. We show the effectiveness of our method through both qualitative and quantitative evaluation on training data generation tasks.

[1]  Yang Song,et al.  Age Progression/Regression by Conditional Adversarial Autoencoder , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Kiyoshi Tanaka,et al.  ArtGAN: Artwork synthesis with conditional categorical GANs , 2017, 2017 IEEE International Conference on Image Processing (ICIP).

[3]  Antonio M. López,et al.  The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Sergio Guadarrama,et al.  Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Bernt Schiele,et al.  Learning What and Where to Draw , 2016, NIPS.

[6]  Kate Saenko,et al.  From Virtual to Reality: Fast Adaptation of Virtual Object Detectors to Real Domains , 2014, BMVC.

[7]  Hiroshi Ishikawa,et al.  Globally and locally consistent image completion , 2017, ACM Trans. Graph..

[8]  Jonathon Shlens,et al.  Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.

[9]  Leonidas J. Guibas,et al.  ObjectNet3D: A Large Scale Database for 3D Object Recognition , 2016, ECCV.

[10]  Andrew Brock,et al.  Neural Photo Editing with Introspective Adversarial Networks , 2016, ICLR.

[11]  Brian C. Lovell,et al.  Unsupervised Domain Adaptation by Domain Invariant Projection , 2013, 2013 IEEE International Conference on Computer Vision.

[12]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Philip David,et al.  Domain Adaptation for Semantic Segmentation of Urban Scenes , 2017 .

[14]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Luc Van Gool,et al.  Pose Guided Person Image Generation , 2017, NIPS.

[16]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[18]  Leon Sixt,et al.  RenderGAN: Generating Realistic Labeled Data , 2016, Front. Robot. AI.

[19]  Raymond Y. K. Lau,et al.  Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[20]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Peter Robinson,et al.  Learning an appearance-based gaze estimator from one million synthesised images , 2016, ETRA.

[22]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[23]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[24]  Michael J. Black,et al.  ClothCap , 2017, ACM Trans. Graph..

[25]  Antonio M. López,et al.  Virtual and Real World Adaptation for Pedestrian Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[27]  Tomas Pfister,et al.  Learning from Simulated and Unsupervised Images through Adversarial Training , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  François Laviolette,et al.  Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res..

[29]  Trevor Darrell,et al.  Adversarial Feature Learning , 2016, ICLR.

[30]  Yinda Zhang,et al.  LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop , 2015, ArXiv.

[31]  Alexei A. Efros,et al.  Generative Visual Manipulation on the Natural Image Manifold , 2016, ECCV.

[32]  Pieter Abbeel,et al.  InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.

[33]  Yoshua Bengio,et al.  Generative Adversarial Networks , 2014, ArXiv.

[34]  Jost Tobias Springenberg,et al.  Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks , 2015, ICLR.

[35]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[36]  Alan L. Yuille,et al.  UnrealCV: Connecting Computer Vision to Unreal Engine , 2016, ECCV Workshops.

[37]  Vincent Lepetit,et al.  Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes , 2012, ACCV.

[38]  Dumitru Erhan,et al.  Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Leonidas J. Guibas,et al.  ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.