Reference-Based Sketch Image Colorization Using Augmented-Self Reference and Dense Semantic Correspondence

This paper tackles the task of automatically colorizing a sketch image given an already-colored reference image. Sketch colorization is in high demand in comics, animation, and other content-creation applications, but it suffers from the scarcity of information in a sketch image. A reference image can address this by making the colorization process reliable and user-driven. However, it is difficult to prepare a training dataset that contains a sufficient number of semantically meaningful image pairs together with the ground truth of a colored output that reflects a given reference (e.g., coloring a sketch of an originally blue car according to a green reference car). To tackle this challenge, we propose to use the identical image, with geometric distortion applied, as a virtual reference, which makes it possible to secure the ground truth for the colored output image. Furthermore, this augmented-self reference naturally provides the ground truth for dense semantic correspondence, which we exploit in an attention mechanism that transfers color from the reference to the sketch input. We demonstrate the effectiveness of our approach on various types of sketch image colorization through quantitative and qualitative evaluation against existing methods.
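
The two ideas above, building an augmented-self reference by geometrically distorting the ground-truth color image and transferring reference colors to the sketch via attention over spatial features, can be illustrated with a short PyTorch sketch. This is a minimal illustration under assumed interfaces, not the authors' implementation: the function names (make_self_reference, attention_color_transfer) and the plain affine distortion are placeholders for whatever augmentation and correspondence module the paper actually uses.

```python
import math

import torch
import torch.nn.functional as F


def make_self_reference(color_image, max_shift=0.1, max_angle=15.0):
    """Distort the ground-truth color image to act as a virtual reference.

    color_image: (B, 3, H, W) tensor in [0, 1]. Because the reference comes
    from the same image, the undistorted original is a valid ground truth for
    the colored output, and the sampling grid encodes the correspondence.
    """
    b, dev = color_image.size(0), color_image.device
    angle = (torch.rand(b, device=dev) * 2 - 1) * max_angle * math.pi / 180.0
    shift = (torch.rand(b, 2, device=dev) * 2 - 1) * max_shift
    cos, sin = torch.cos(angle), torch.sin(angle)
    # Per-sample 2x3 affine matrices for F.affine_grid.
    theta = torch.stack(
        [torch.stack([cos, -sin, shift[:, 0]], dim=1),
         torch.stack([sin, cos, shift[:, 1]], dim=1)], dim=1)
    grid = F.affine_grid(theta, color_image.shape, align_corners=False)
    return F.grid_sample(color_image, grid, align_corners=False)


def attention_color_transfer(sketch_feat, ref_feat):
    """Scaled dot-product attention from sketch positions to reference positions.

    sketch_feat, ref_feat: (B, C, H, W) feature maps of the sketch (queries)
    and the reference (keys/values). Returns sketch features augmented with
    information gathered from semantically corresponding reference locations.
    """
    b, c, h, w = sketch_feat.shape
    q = sketch_feat.flatten(2).transpose(1, 2)        # (B, HW, C)
    k = ref_feat.flatten(2)                           # (B, C, HW)
    v = ref_feat.flatten(2).transpose(1, 2)           # (B, HW, C)
    attn = torch.softmax(q @ k / c ** 0.5, dim=-1)    # (B, HW_sketch, HW_ref)
    out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
    return sketch_feat + out                          # residual fusion
```

In training, the output of make_self_reference would be paired with a sketch extracted from the same source image, so the undistorted original serves directly as the colorization ground truth, and the known distortion supplies supervision for the attention-based correspondence.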
