Scribbler: Controlling Deep Image Synthesis with Sketch and Color

Several recent works have used deep convolutional networks to generate realistic imagery. These methods sidestep the traditional computer graphics rendering pipeline and instead generate imagery at the pixel level by learning from large collections of photos (e.g. faces or bedrooms). However, these methods are of limited utility because it is difficult for a user to control what the network produces. In this paper, we propose a deep adversarial image synthesis architecture that is conditioned on sketched boundaries and sparse color strokes to generate realistic cars, bedrooms, or faces. We demonstrate a sketch-based image synthesis system that allows users to scribble over the sketch to indicate the preferred colors for objects. Our network can then generate convincing images that satisfy both the color and the sketch constraints of the user. The network is feed-forward, which allows users to see the effect of their edits in real time. We compare to recent work on sketch-to-image synthesis and show that our approach generates more realistic, diverse, and controllable outputs. The architecture is also effective at user-guided colorization of grayscale images.
