Recovering Realistic Texture in Image Super-Resolution by Deep Spatial Feature Transform

Despite that convolutional neural networks (CNN) have recently demonstrated high-quality reconstruction for single-image super-resolution (SR), recovering natural and realistic texture remains a challenging problem. In this paper, we show that it is possible to recover textures faithful to semantic classes. In particular, we only need to modulate features of a few intermediate layers in a single network conditioned on semantic segmentation probability maps. This is made possible through a novel Spatial Feature Transform (SFT) layer that generates affine transformation parameters for spatial-wise feature modulation. SFT layers can be trained end-to-end together with the SR network using the same loss function. During testing, it accepts an input image of arbitrary size and generates a high-resolution image with just a single forward pass conditioned on the categorical priors. Our final results show that an SR network equipped with SFT can generate more realistic and visually pleasing textures in comparison to state-of-the-art SRGAN [27] and EnhanceNet [38].

[1]  Jiejie Zhu,et al.  Context-constrained hallucination for image super-resolution , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  Bolei Zhou,et al.  Scene Parsing through ADE20K Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Jian Yang,et al.  MemNet: A Persistent Memory Network for Image Restoration , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[4]  Kyoung Mu Lee,et al.  Deeply-Recursive Convolutional Network for Image Super-Resolution , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Aline Roumy,et al.  Low-Complexity Single-Image Super-Resolution based on Nonnegative Neighbor Embedding , 2012, BMVC.

[6]  Eric Dubois,et al.  Image up-sampling using total-variation regularization with a new observation model , 2005, IEEE Transactions on Image Processing.

[7]  Jian Yang,et al.  Image Super-Resolution via Deep Recursive Residual Network , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  George Papandreou,et al.  Rethinking Atrous Convolution for Semantic Image Segmentation , 2017, ArXiv.

[9]  Serge J. Belongie,et al.  Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[10]  Hugo Larochelle,et al.  Modulating early visual processing by language , 2017, NIPS.

[11]  Xiaoou Tang,et al.  Learning a Deep Convolutional Network for Image Super-Resolution , 2014, ECCV.

[12]  Ming-Hsuan Yang,et al.  Diversified Texture Synthesis with Feed-Forward Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Hong Chang,et al.  Super-resolution through neighbor embedding , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[14]  Raanan Fattal,et al.  Image upsampling via imposed edge statistics , 2007, ACM Trans. Graph..

[15]  Luc Van Gool,et al.  Semantic super-resolution: When and where is it useful? , 2016, Comput. Vis. Image Underst..

[16]  Jonathon Shlens,et al.  Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.

[17]  Xiaoxiao Li,et al.  Deep Learning Markov Random Field for Semantic Segmentation , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Xiaoou Tang,et al.  Depth Map Super-Resolution by Deep Multi-Scale Guidance , 2016, ECCV.

[19]  Jonathon Shlens,et al.  A Learned Representation For Artistic Style , 2016, ICLR.

[20]  Michael Elad,et al.  On Single Image Scale-Up Using Sparse-Representations , 2010, Curves and Surfaces.

[21]  Xiaoxiao Li,et al.  Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Vladlen Koltun,et al.  Photographic Image Synthesis with Cascaded Refinement Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[23]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[24]  Luc Van Gool,et al.  NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[25]  C. Duchon Lanczos Filtering in One and Two Dimensions , 1979 .

[26]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[29]  Kwang In Kim,et al.  Single-Image Super-Resolution Using Sparse Regression and Natural Image Prior , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[31]  Bernhard Schölkopf,et al.  EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[32]  Joan Bruna,et al.  Super-Resolution with Deep Convolutional Sufficient Statistics , 2015, ICLR.

[33]  Leon A. Gatys,et al.  Controlling Perceptual Factors in Neural Style Transfer , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Luc Van Gool,et al.  A+: Adjusted Anchored Neighborhood Regression for Fast Super-Resolution , 2014, ACCV.

[35]  Narendra Ahuja,et al.  Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Xiaochun Cao,et al.  Video Deblurring via Semantic Segmentation and Pixel-Wise Non-linear Kernel , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[38]  Thomas S. Huang,et al.  Image super-resolution as sparse representation of raw image patches , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[39]  Xiaoou Tang,et al.  Image Super-Resolution Using Deep Convolutional Networks , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Aaron C. Courville,et al.  Learning Visual Reasoning Without Strong Priors , 2017, ICML 2017.

[41]  Deqing Sun,et al.  Learning to Super-Resolve Blurry Face and Text Images , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[42]  Sanja Fidler,et al.  Be Your Own Prada: Fashion Synthesis with Structural Coherence , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[43]  Lei Zhang,et al.  Image Deblurring and Super-Resolution by Adaptive Sparse Domain Selection and Adaptive Regularization , 2010, IEEE Transactions on Image Processing.

[44]  Lei Zhang,et al.  Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising , 2016, IEEE Transactions on Image Processing.

[45]  Xiaoxiao Li,et al.  Semantic Image Segmentation via Deep Parsing Network , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[46]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[47]  Jitendra Malik,et al.  A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[48]  Horst Bischof,et al.  Fast and accurate image upscaling with super-resolution forests , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[50]  Honglak Lee,et al.  Exploring the structure of a real-time, arbitrary neural artistic stylization network , 2017, BMVC.

[51]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[52]  Luc Van Gool,et al.  Anchored Neighborhood Regression for Fast Example-Based Super-Resolution , 2013, 2013 IEEE International Conference on Computer Vision.

[53]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[54]  Kyoung Mu Lee,et al.  Accurate Image Super-Resolution Using Very Deep Convolutional Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[55]  Michal Irani,et al.  Super-resolution from a single image , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[56]  Aaron C. Courville,et al.  FiLM: Visual Reasoning with a General Conditioning Layer , 2017, AAAI.

[57]  Eirikur Agustsson,et al.  NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).