Automatic semantic style transfer using deep convolutional neural networks and soft masks

This paper presents an automatic image synthesis method that transfers the style of an example image to a content image. Standard neural style transfer approaches often apply the textures and colours of different semantic regions of the style image to inappropriate regions of the content image, ignoring its semantic layout and ruining the transfer result. To reduce or avoid such effects, we propose a novel method that automatically segments the objects in the style and content images and extracts their soft semantic masks, so that the structure of the content image is preserved while the style is transferred. Each soft mask of the style image represents a specific part of the style image and corresponds to the soft mask of the content image with the same semantics. Both the soft masks and the source images are provided as multichannel input to an augmented deep CNN framework for style transfer, which incorporates a generative Markov random field model. Results on various images show that our method outperforms the most recent techniques.
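
The idea of feeding soft masks as extra channels alongside image features can be illustrated with a small sketch. The abstract does not give implementation details, so everything below is an assumption made for illustration: random arrays stand in for CNN feature maps, the function names and the `mask_weight` parameter are hypothetical, and patch matching uses plain normalized cross-correlation, as is common in MRF-based style transfer. It shows only how appended mask channels can steer each content patch towards style patches with the same semantics.

```python
import numpy as np

def augment_with_masks(features, masks, mask_weight=10.0):
    """Append soft semantic masks to a feature map as extra channels.

    features: (C, H, W) array, a stand-in for CNN activations.
    masks:    (K, H, W) array of soft masks with values in [0, 1].
    mask_weight is a hypothetical knob: larger values make semantics
    dominate the patch similarity.
    """
    if features.shape[1:] != masks.shape[1:]:
        raise ValueError("features and masks must share spatial size")
    return np.concatenate([features, mask_weight * masks], axis=0)

def extract_patches(x, size=3):
    """Return every size x size patch of x (C, H, W) as a flat row."""
    _, h, w = x.shape
    rows = [
        x[:, i:i + size, j:j + size].ravel()
        for i in range(h - size + 1)
        for j in range(w - size + 1)
    ]
    return np.stack(rows)

def nearest_style_patches(content_aug, style_aug, size=3):
    """For each content patch, return the index of the style patch
    with the highest normalized cross-correlation."""
    cp = extract_patches(content_aug, size)
    sp = extract_patches(style_aug, size)
    cp = cp / (np.linalg.norm(cp, axis=1, keepdims=True) + 1e-8)
    sp = sp / (np.linalg.norm(sp, axis=1, keepdims=True) + 1e-8)
    return np.argmax(cp @ sp.T, axis=1)

# Toy usage: region 0 is on the left of the style image but on the
# right of the content image, and vice versa for region 1.
rng = np.random.default_rng(0)
style_feat = rng.normal(size=(4, 8, 8))
content_feat = rng.normal(size=(4, 8, 8))
left = np.zeros((8, 8))
left[:, :4] = 1.0
right = 1.0 - left
style_aug = augment_with_masks(style_feat, np.stack([left, right]))
content_aug = augment_with_masks(content_feat, np.stack([right, left]))
matches = nearest_style_patches(content_aug, style_aug)
```

In this toy setup, content patches that lie entirely in one semantic region are matched to style patches from the region carrying the same mask channel, regardless of where that region sits in the style image. In a real pipeline the masks would come from a segmentation network and the features from a pretrained CNN; neither is modelled here.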
