Multi-Density Sketch-to-Image Translation Network

Sketch-to-image (S2I) translation plays an important role in image synthesis and manipulation tasks such as photo editing and colorization. Specific S2I tasks, including sketch-to-photo and sketch-to-painting translation, serve as powerful tools in the art and design industry. However, previous methods support only a single density level for the input sketch, giving users little flexibility in controlling their input. In this work, we propose the first multi-level density sketch-to-image translation framework, which allows input sketches to range from rough object outlines to micro-structures. To tackle the discontinuous representation of multi-level density input sketches, we project the density level into a continuous latent space that can be linearly controlled by a single parameter, letting users conveniently adjust both the density of the input sketch and the generated image. Our method has been verified on various datasets and applications, including face editing, multi-modal sketch-to-photo translation, and anime colorization, providing coarse-to-fine levels of control in each.
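The continuous density control described above can be illustrated with a minimal sketch. Assuming the framework learns one anchor embedding per discrete density level (the anchor values and function names here are illustrative, not from the paper), a scalar parameter t in [0, 1] can be mapped to a latent density code by piecewise-linear interpolation between adjacent anchors:

```python
import numpy as np

def density_embedding(t, anchors):
    """Map a continuous density parameter t in [0, 1] to a latent code
    by linearly interpolating between learned anchor embeddings.

    anchors: list of np.ndarray, one embedding per discrete density level,
             ordered from coarsest (rough outline) to finest (micro-structure).
    """
    n = len(anchors) - 1                      # number of interpolation segments
    t = float(np.clip(t, 0.0, 1.0))           # keep the control parameter in range
    pos = t * n                               # position along the anchor sequence
    lo = int(np.floor(pos))                   # lower anchor index
    hi = min(lo + 1, n)                       # upper anchor index
    frac = pos - lo                           # blend weight within the segment
    return (1.0 - frac) * anchors[lo] + frac * anchors[hi]

# Example: three density levels, 4-dimensional toy embeddings.
anchors = [np.zeros(4), np.ones(4), np.full(4, 2.0)]
coarse = density_embedding(0.0, anchors)   # coarsest: rough outline code
mid = density_embedding(0.5, anchors)      # halfway: middle-density code
fine = density_embedding(1.0, anchors)     # finest: micro-structure code
```

With this kind of parameterization, sliding t smoothly moves the conditioning code between density levels instead of jumping between discrete modes, which is what makes the user-facing linear control possible.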
