Towards Metamerism via Foveated Style Transfer

The problem of $\textit{visual metamerism}$ is defined as finding a family of perceptually indistinguishable, yet physically different images. In this paper, we propose our NeuroFovea metamer model, a foveated generative model that is based on a mixture of peripheral representations and style transfer forward-pass algorithms. Our gradient-descent-free model is parametrized by a foveated VGG19 encoder-decoder, which allows us to encode images in a high-dimensional space and interpolate between the content and texture information with adaptive instance normalization anywhere in the visual field. Our contributions include: 1) A framework for computing metamers that resembles a noisy communication system via a foveated feed-forward encoder-decoder network: we observe that metamerism arises as a byproduct of noisy perturbations that partially lie in the perceptual null space; 2) A perceptual optimization scheme as a solution to the hyperparametric nature of our metamer model, which requires tuning the image-texture tradeoff coefficients, themselves a consequence of internal noise, everywhere in the visual field; 3) An ABX psychophysical evaluation of our metamers, where we also find that the rate of growth of the receptive fields in our model matches V1 for reference metamers and V2 for comparisons between synthesized samples. Our model also renders metamers in roughly a second, a $1000\times$ speed-up over previous work, which allows for tractable data-driven metamer experiments.
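To make the interpolation step concrete, the following is a minimal NumPy sketch of adaptive instance normalization (AdaIN) followed by a spatially varying blend between content and texture features. It is an illustration under stated assumptions, not the NeuroFovea implementation: the function names `adain` and `interpolate_content_texture`, the `(C, H, W)` feature layout, and the per-pixel coefficient map `alpha` are all hypothetical, and the VGG19 encoder-decoder and the foveated pooling regions are omitted.

```python
import numpy as np

def adain(content, texture, eps=1e-5):
    """Re-normalize content features to the per-channel mean and standard
    deviation of the texture features. Both inputs have shape (C, H, W)."""
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps  # eps avoids division by zero
    t_mean = texture.mean(axis=(1, 2), keepdims=True)
    t_std = texture.std(axis=(1, 2), keepdims=True)
    return t_std * (content - c_mean) / c_std + t_mean

def interpolate_content_texture(content, texture, alpha):
    """Blend content features with their AdaIN-stylized version.
    alpha is a scalar or an (H, W) map in [0, 1] (assumed here to stand in
    for a location-dependent tradeoff coefficient): 0 keeps the content,
    1 uses the fully texturized features."""
    alpha = np.asarray(alpha, dtype=content.dtype)
    return (1.0 - alpha) * content + alpha * adain(content, texture)

# Toy usage: random arrays stand in for encoder feature maps.
rng = np.random.default_rng(0)
content = rng.normal(size=(4, 8, 8))
texture = 3.0 * rng.normal(size=(4, 8, 8)) + 2.0

# Hypothetical coefficient map that grows from left (0) to right (1).
alpha_map = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))
blended = interpolate_content_texture(content, texture, alpha_map)
```

Because `alpha` broadcasts over the channel axis, a single map can assign a different content-texture tradeoff to every spatial location, which is the property the abstract relies on when it speaks of interpolating "anywhere in the visual field".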
