A Bayesian Perspective on the Deep Image Prior

The deep image prior was recently introduced as a prior for natural images. It represents images as the output of a convolutional network with random inputs. For “inference”, gradient descent is performed to adjust network parameters to make the output match observations. This approach yields good performance on a range of image reconstruction tasks. We show that the deep image prior is asymptotically equivalent to a stationary Gaussian process prior in the limit as the number of channels in each layer of the network goes to infinity, and derive the corresponding kernel. This informs a Bayesian approach to inference. We show that by conducting posterior inference using stochastic gradient Langevin dynamics we avoid the need for early stopping, which is a drawback of the current approach, and improve results for denoising and inpainting tasks. We illustrate these intuitions on a number of 1D and 2D signal reconstruction tasks.
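As a rough illustration of this inference scheme, the sketch below (not the authors' implementation; the generator architecture, image size, step size `eps`, and prior variance `prior_var` are all illustrative assumptions) fits a small convolutional generator to a noisy image with SGLD. Each update takes a half-step of gradient descent on the negative log posterior and injects Gaussian noise with variance equal to the step size; late-iteration outputs are averaged as approximate posterior samples, which is what removes the need for early stopping.

```python
# A minimal sketch, NOT the authors' implementation: the architecture,
# image size, step size, and prior variance below are assumptions.
# It fits a small convolutional "deep image prior" generator to a noisy
# image using stochastic gradient Langevin dynamics (SGLD):
#   theta <- theta - (eps/2) * grad(neg log posterior) + N(0, eps)
import torch
import torch.nn as nn

torch.manual_seed(0)

# Generator: a fixed random code z is pushed through trainable conv layers.
net = nn.Sequential(
    nn.Conv2d(8, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 1, 3, padding=1),
)
z = torch.randn(1, 8, 32, 32)   # random input, held fixed throughout
y = torch.rand(1, 1, 32, 32)    # observed noisy image (stand-in data)

eps = 1e-4        # SGLD step size
prior_var = 1.0   # assumed Gaussian prior variance on the weights
samples = []

for step in range(5000):
    loss = ((net(z) - y) ** 2).sum()   # Gaussian log-likelihood up to scale
    net.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in net.parameters():
            # Half-step on the negative log posterior (likelihood + prior)...
            p -= 0.5 * eps * (p.grad + p / prior_var)
            # ...plus injected Gaussian noise of variance eps.
            p += eps ** 0.5 * torch.randn_like(p)
    # Keep late-iteration outputs as approximate posterior samples.
    if step > 4000 and step % 100 == 0:
        samples.append(net(z).detach())

# Posterior-mean estimate: averaging samples replaces the early stopping
# needed when plain gradient descent is run to convergence.
denoised = torch.stack(samples).mean(0)
```

In this sketch the averaged late-iteration samples play the role of the posterior mean, whereas plain gradient descent would eventually overfit the noise and must be stopped early by hand.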
