
From A to Z: Supervised Transfer of Style and Content Using Deep Neural Network Generators

We propose a new neural network architecture for solving single-image analogies: the generation of an entire set of stylistically similar images from just a single input image. Solving this problem requires separating image style from content. Our network is a modified variational autoencoder (VAE) that supports supervised training of single-image analogies and in-network evaluation of outputs with a structured similarity objective that captures pixel covariances. On the challenging task of generating a 62-letter font from a single example letter, we produce images with 22.4% lower dissimilarity to the ground truth than the state of the art.
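To make "a structured similarity objective that captures pixel covariances" concrete, here is a minimal sketch of the standard structural similarity index (SSIM) and a dissimilarity derived from it. It is not the paper's in-network objective, which this page does not spell out. The function names `global_ssim` and `dssim`, the single whole-image window, the constants 0.01 and 0.03, and the `(1 - SSIM) / 2` scaling are all assumptions chosen for illustration.

```python
from itertools import chain


def global_ssim(x, y, data_range=1.0):
    """SSIM over two equally sized grayscale images given as lists of rows.

    Assumption: the whole image is treated as one window. Practical SSIM
    implementations usually average the index over local windows instead.
    """
    px = list(chain.from_iterable(x))
    py = list(chain.from_iterable(y))
    if not px or len(px) != len(py):
        raise ValueError("images must be non-empty and the same size")
    n = len(px)

    # Stabilizing constants (commonly used defaults, assumed here).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x = sum(px) / n
    mu_y = sum(py) / n
    var_x = sum((a - mu_x) * (a - mu_x) for a in px) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in py) / n
    # The covariance term is what makes the measure sensitive to structure,
    # not just to per-pixel error.
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(px, py)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def dssim(x, y, data_range=1.0):
    """Dissimilarity in [0, 1]; 0 means the images are identical."""
    return (1.0 - global_ssim(x, y, data_range)) / 2.0


if __name__ == "__main__":
    glyph = [[0.0, 0.25], [0.5, 1.0]]
    inverted = [[1.0 - v for v in row] for row in glyph]
    print(dssim(glyph, glyph))
    print(dssim(glyph, inverted))
```

An image compared with itself gives a dissimilarity of zero, while its inverse, whose pixels are negatively correlated with the original, scores above 0.5. A loss of this kind penalizes structural disagreement that a plain per-pixel squared error would weight differently.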

Noah Snavely | Kavita Bala | Paul Upchurch
