Pitfalls of the Gram Loss for Neural Texture Synthesis in Light of Deep Feature Histograms

Neural texture synthesis and style transfer are both powered by the Gram matrix as a means to measure deep feature statistics. Despite its ubiquity, this second-order feature descriptor has several shortcomings resulting in visual artifacts, ill-defined interpolation, or inability to capture spatial constraints. Many previous works acknowledge these shortcomings but do not really explain why they occur. Fixing them is thus usually approached by adding new losses, which require parameter tuning and make the problem even more ill-defined, or architecturing complex and/or adversarial networks. In this paper, we propose a comprehensive study of these problems in the light of the multi-dimensional histograms of deep features. With the insights gained from our analysis, we show how to compute a well-defined and efficient textural loss based on histogram transformations. Our textural loss outperforms the Gram matrix in terms of quality, robustness, spatial control, and interpolation. It does not require additional learning or parameter tuning, and can be implemented in a few lines of code.

[1]  Jorge Nocedal,et al.  Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization , 1997, TOMS.

[2]  Alex J. Champandard,et al.  Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks , 2016, ArXiv.

[3]  Gustavo K. Rohde,et al.  Sliced-Wasserstein Autoencoder: An Embarrassingly Simple Generative Model , 2018, ArXiv.

[4]  Yann Gousseau,et al.  Wasserstein Loss for Image Synthesis and Restoration , 2016, SIAM J. Imaging Sci..

[5]  Feng Xu,et al.  A Closed-Form Solution to Universal Style Transfer , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[6]  Connelly Barnes,et al.  Stable and Controllable Neural Texture Synthesis and Style Transfer Using Histogram Losses , 2017, ArXiv.

[7]  Zhucun Xue,et al.  Texture Mixing by Interpolating Deep Statistics via Gaussian Models , 2018, IEEE Access.

[8]  Eli Shechtman,et al.  Texture Mixer: A Network for Controllable Synthesis and Interpolation of Texture , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Ming-Hsuan Yang,et al.  Diversified Texture Synthesis with Feed-Forward Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Andrea Vedaldi,et al.  Improved Texture Networks: Maximizing Quality and Diversity in Feed-Forward Stylization and Texture Synthesis , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Leon A. Gatys,et al.  Texture Synthesis Using Convolutional Neural Networks , 2015, NIPS.

[12]  Daniel Cohen-Or,et al.  Deep Correlations for Texture Synthesis , 2017, ACM Trans. Graph..

[13]  Gregory Shakhnarovich,et al.  Style Transfer by Relaxed Optimal Transport and Self-Similarity , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Luc Van Gool,et al.  Sliced Wasserstein Generative Models , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Leon A. Gatys,et al.  Preserving Color in Neural Artistic Style Transfer , 2016, ArXiv.

[16]  Gang Liu,et al.  Texture synthesis through convolutional neural networks and spectrum constraints , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[17]  Alexander G. Schwing,et al.  Generative Modeling Using the Sliced Wasserstein Distance , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18]  Andrea Vedaldi,et al.  Texture Networks: Feed-forward Synthesis of Textures and Stylized Images , 2016, ICML.

[19]  Peter Wonka,et al.  TileGAN , 2019, ACM Trans. Graph..

[20]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[21]  Leon A. Gatys,et al.  Controlling Perceptual Factors in Neural Style Transfer , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Jitendra Malik,et al.  Implicit Maximum Likelihood Estimation , 2018, ArXiv.

[23]  Xavier Snelgrove,et al.  High-resolution multi-scale neural texture synthesis , 2017, SIGGRAPH Asia Technical Briefs.

[24]  Ming-Hsuan Yang,et al.  Universal Style Transfer via Feature Transforms , 2017, NIPS.

[25]  Roland Vollgraf,et al.  Learning Texture Manifolds with the Periodic Spatial GAN , 2017, ICML.

[26]  Dani Lischinski,et al.  Non-stationary texture synthesis by adversarial expansion , 2018, ACM Trans. Graph..

[27]  Leon A. Gatys,et al.  Image Style Transfer Using Convolutional Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Julien Rabin,et al.  Sliced and Radon Wasserstein Barycenters of Measures , 2014, Journal of Mathematical Imaging and Vision.

[29]  Julien Rabin,et al.  Wasserstein Barycenter and Its Application to Texture Mixing , 2011, SSVM.

[30]  Lihi Zelnik-Manor,et al.  The Contextual Loss for Image Transformation with Non-Aligned Data , 2018, ECCV.

[31]  A.C. Kokaram,et al.  N-dimensional probability density function transfer and its application to color transfer , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.