Manga filling style conversion with screentone variational autoencoder

Western color comics and Japanese-style screened manga are two popular comic styles. They differ mainly in their style of region filling. Conversion between the two region-filling styles is very challenging and is currently done manually. In this paper, we identify that the major obstacle to conversion between the two filling styles stems from the difference between the fundamental properties of screened region filling and colored region filling. To overcome this obstacle, we propose a screentone variational autoencoder, ScreenVAE, that maps screened manga to an intermediate domain. This intermediate domain summarizes local texture characteristics and supports interpolation. With this domain, we effectively unify the properties of screening and color filling, easing the learning of a bidirectional translation between screened manga and color comics. To carry out the bidirectional translation, we further propose a network that learns the translation between the intermediate domain and color comics. Our model can generate high-quality screened manga from a color comic, and can generate color comics that retain the original screening intent of the bitonal manga artist. Several results demonstrate the effectiveness and convenience of the proposed method. We also show how the intermediate domain can assist other applications, such as manga inpainting and photo-to-comic conversion.
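To make the ScreenVAE idea concrete, the sketch below shows the generic VAE forward pass that underlies it: a bitonal screentone patch is encoded into a low-dimensional intermediate code via the reparameterization trick, then decoded back. This is a minimal illustrative sketch, not the paper's trained model; the patch size, the 4-channel latent width, the layer sizes, and the randomly initialized weights are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a flattened 16x16 bitonal screentone patch is mapped
# to a 4-channel intermediate code (sizes chosen only for illustration).
PATCH = 16 * 16
LATENT = 4
HIDDEN = 64

# Randomly initialized weights stand in for a trained encoder/decoder.
W_enc = rng.normal(0.0, 0.05, (PATCH, HIDDEN))
W_mu = rng.normal(0.0, 0.05, (HIDDEN, LATENT))
W_logvar = rng.normal(0.0, 0.05, (HIDDEN, LATENT))
W_dec1 = rng.normal(0.0, 0.05, (LATENT, HIDDEN))
W_dec2 = rng.normal(0.0, 0.05, (HIDDEN, PATCH))

def encode(x):
    """Map patches to the mean and log-variance of the latent code."""
    h = np.tanh(x @ W_enc)
    return h @ W_mu, h @ W_logvar

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps, keeping the sampling differentiable."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(z):
    """Reconstruct a patch from the latent code; sigmoid keeps values in (0, 1)."""
    h = np.tanh(z @ W_dec1)
    return 1.0 / (1.0 + np.exp(-(h @ W_dec2)))

# A batch of 8 binary screentone-like patches.
patches = (rng.random((8, PATCH)) > 0.5).astype(np.float64)
mu, logvar = encode(patches)
z = reparameterize(mu, logvar)
recon = decode(z)
print(mu.shape, recon.shape)  # (8, 4) (8, 256)
```

Because the latent codes are continuous, two codes can be linearly interpolated and decoded, which is the "interpolative" property the intermediate domain relies on; a trained model would additionally impose the usual VAE reconstruction and KL losses.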
