Paper information - Benefiting from Multitask Learning to Improve Single Image Super-Resolution

Despite significant progress toward super-resolving more realistic images with deeper convolutional neural networks (CNNs), reconstructing fine and natural textures remains a challenging problem. Recent work on single image super-resolution (SISR) mostly optimizes pixel-wise and content-wise similarity between the recovered and high-resolution (HR) images and does not exploit the recognizability of semantic classes. In this paper, we introduce a novel approach that uses categorical information to tackle the SISR problem: we present a decoder architecture that extracts and uses semantic information to super-resolve a given image through multitask learning, trained simultaneously for image super-resolution and semantic segmentation. To explore categorical information during training, the proposed decoder employs a single shared deep network with two task-specific output layers. At run time, only the layers that produce the HR image are used, and no segmentation label is required. Extensive perceptual experiments and a user study on images randomly selected from the COCO-Stuff dataset demonstrate the effectiveness of the proposed method, which outperforms state-of-the-art methods.
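The multitask setup described in the abstract (one shared network, two task-specific output layers, and only the super-resolution output used at run time) can be illustrated with a toy sketch. This is not the paper's architecture: the per-pixel linear trunk, the squared-error and cross-entropy losses, the `seg_weight` balancing factor, and every function and parameter name below are assumptions chosen to keep the example small and dependency-free.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(w, v):
    return [sum(wi * xi for wi, xi in zip(row, v)) for row in w]

def shared_features(pixel, w_shared):
    """Shared trunk (assumed here to be one linear layer plus ReLU)."""
    return relu(matvec(w_shared, pixel))

def sr_head(features, w_sr):
    """Task-specific output for super-resolution: one intensity value."""
    return sum(w * f for w, f in zip(w_sr, features))

def seg_head(features, w_seg):
    """Task-specific output for segmentation: one logit per class."""
    return matvec(w_seg, features)

def cross_entropy(logits, label):
    """Softmax cross-entropy, computed with the log-sum-exp trick."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]

def multitask_loss(pixel, hr_value, seg_label, params, seg_weight=0.5):
    """Training objective: both heads read the same shared features.

    seg_weight is a hypothetical balancing factor, not a value from the paper.
    """
    feats = shared_features(pixel, params["shared"])
    sr_loss = (sr_head(feats, params["sr"]) - hr_value) ** 2
    seg_loss = cross_entropy(seg_head(feats, params["seg"]), seg_label)
    return sr_loss + seg_weight * seg_loss

def super_resolve(pixel, params):
    """Run-time path: trunk and SR head only; no segmentation label needed."""
    feats = shared_features(pixel, params["shared"])
    return sr_head(feats, params["sr"])

# Hypothetical toy parameters and a single two-feature "pixel".
params = {
    "shared": [[1.0, 0.0], [0.0, 1.0]],
    "sr": [0.5, 0.5],
    "seg": [[1.0, 0.0], [0.0, 1.0]],
}
pixel = [0.2, 0.6]
train_loss = multitask_loss(pixel, hr_value=0.5, seg_label=1, params=params)
prediction = super_resolve(pixel, params)
```

The point of the sketch is the data flow: during training, gradients from the segmentation term would also shape the shared features, while `super_resolve` never touches the segmentation head or its labels.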
