SDP-GAN: Saliency Detail Preservation Generative Adversarial Networks for High Perceptual Quality Style Transfer

This paper proposes a method for handling salient regions effectively in style transfer between unpaired datasets. Recently, Generative Adversarial Networks (GANs) have demonstrated their potential for translating images from a source domain ${X}$ to a target domain ${Y}$ in the absence of paired examples. However, such a translation is not guaranteed to produce results of high perceptual quality. Existing style transfer methods work well on relatively uniform content, but they often fail to capture the geometric or structural patterns that typically belong to salient regions. Detail loss in structured regions and undesired artifacts in smooth regions occur even when each individual region is correctly transferred into the target style. In this paper, we propose SDP-GAN, a GAN-based network that addresses these problems while generating visually pleasing style transfer results. We introduce a saliency network, which is trained simultaneously with the generator. The saliency network has two functions: (1) providing constraints for the content loss so that errors in salient regions are penalized more heavily, and (2) supplying saliency features to the generator so that it produces coherent results. Moreover, two novel losses are proposed to optimize the generator and the saliency network. The proposed method preserves details in important salient regions and improves the overall perceptual quality of the image. Qualitative and quantitative comparisons against several leading prior methods demonstrate the superiority of our method.
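
The first function of the saliency network, penalizing content errors more heavily in salient regions, can be illustrated with a minimal sketch. This is not the loss defined in the paper, whose exact form the abstract does not give. It assumes a per-pixel weight of 1 + alpha * saliency applied to an absolute feature difference; the function name, the parameter alpha, and the array shapes are all hypothetical choices made for illustration.

```python
import numpy as np

def saliency_weighted_content_loss(content_feat, output_feat, saliency, alpha=1.0):
    """Mean absolute feature difference, up-weighted where saliency is high.

    content_feat, output_feat: arrays of shape (C, H, W)
    saliency: array of shape (H, W) with values in [0, 1]
    alpha: hypothetical strength of the extra penalty on salient pixels
    """
    # Assumed weighting: every pixel has weight >= 1, salient pixels up to 1 + alpha.
    weights = 1.0 + alpha * saliency
    # Per-channel, per-pixel absolute difference between the two feature maps.
    diff = np.abs(content_feat - output_feat)
    # Broadcast the (H, W) weights over the channel axis and average.
    return float(np.mean(weights[None, :, :] * diff))
```

Under this assumed weighting, the same feature error costs more when it falls inside a salient region than when it falls in the background, which is the behaviour the abstract describes qualitatively.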
