Pixel Recursive Super Resolution

Super resolution is the problem of artificially enlarging a low resolution photograph to recover a plausible high resolution version. At high magnification factors, the problem is dramatically underspecified: many plausible high resolution images may match a given low resolution image. Traditional super resolution techniques fail in this regime because the problem is multimodal and because strong prior information must be imposed on image synthesis to produce plausible high resolution images. In this work we propose a new probabilistic deep network architecture, a pixel recursive super resolution model, that extends PixelCNNs to address this problem. We demonstrate that this model produces a diversity of plausible high resolution images at large magnification factors. Furthermore, in human evaluation studies, previous methods fail to fool human observers, whereas high resolution images sampled from this probabilistic deep network do fool a naive human observer a significant fraction of the time.
