Paper Information - Perceptual Losses for Real-Time Style Transfer and Super-Resolution

Perceptual Losses for Real-Time Style Transfer and Super-Resolution

We consider image transformation problems, where an input image is transformed into an output image. Recent methods for such problems typically train feed-forward convolutional neural networks using a per-pixel loss between the output and ground-truth images. Parallel work has shown that high-quality images can be generated by defining and optimizing perceptual loss functions based on high-level features extracted from pretrained networks. We combine the benefits of both approaches, and propose the use of perceptual loss functions for training feed-forward networks for image transformation tasks. We show results on image style transfer, where a feed-forward network is trained to solve the optimization problem proposed by Gatys et al. in real time. Compared to the optimization-based method, our network gives similar qualitative results but is three orders of magnitude faster. We also experiment with single-image super-resolution, where replacing a per-pixel loss with a perceptual loss gives visually pleasing results.
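The contrast the abstract draws between a per-pixel loss and a perceptual loss can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only says the features come from a pretrained network, so `toy_features` below is a hypothetical stand-in (horizontal differences followed by a ReLU), and the image type and function names are assumptions made for illustration.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def mse(a: Image, b: Image) -> float:
    """Mean squared difference between two equally sized 2-D arrays."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def per_pixel_loss(output: Image, target: Image) -> float:
    """Compare the two images directly in pixel space."""
    return mse(output, target)


def toy_features(img: Image) -> Image:
    """Hypothetical stand-in for a layer of a pretrained network:
    horizontal differences followed by a ReLU. Not a real network."""
    return [
        [max(0.0, row[i + 1] - row[i]) for i in range(len(row) - 1)]
        for row in img
    ]


def perceptual_loss(
    output: Image,
    target: Image,
    features: Callable[[Image], Image] = toy_features,
) -> float:
    """Compare the two images in feature space instead of pixel space."""
    return mse(features(output), features(target))


target = [[0.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0]]
# Same pattern with every pixel brightened by 0.5.
brighter = [[p + 0.5 for p in row] for row in target]

print(per_pixel_loss(brighter, target))   # 0.25
print(perceptual_loss(brighter, target))  # 0.0
```

With these toy features, a uniform brightness offset changes every pixel yet leaves the feature maps unchanged, so the two losses disagree. That behaviour is a property of the stand-in features only; a real setup would pass a pretrained network's activations as `features`.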
