The Neural Tangent Link Between CNN Denoisers and Non-Local Filters

Convolutional Neural Networks (CNNs) are now a well-established tool for solving computational imaging problems. Modern CNN-based algorithms obtain state-of-the-art performance in diverse image restoration problems. Furthermore, it has recently been shown that, despite being highly overparameterized, networks trained with a single corrupted image can still perform as well as fully trained networks. We introduce a formal link between such networks, through their neural tangent kernel (NTK), and well-known non-local filtering techniques such as non-local means or BM3D. The filtering function associated with a given network architecture can be obtained in closed form without the need to train the network, being fully characterized by the random initialization of the network weights. While the NTK theory accurately predicts the filter associated with networks trained using standard gradient descent, our analysis shows that it falls short of explaining the behaviour of networks trained using the popular Adam optimizer. The latter induces a larger change in the hidden-layer weights, adapting the non-local filtering function during training. We evaluate our findings via extensive image denoising experiments.
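The closed-form filter described above can be illustrated with a minimal NumPy sketch. For a network trained by (linearized) gradient descent on a single noisy image with squared loss and zero initial output, NTK theory gives the output at training time t as y_t = (I - e^{-ηtK}) y, where K is the empirical NTK at initialization. The two-layer ReLU surrogate, signal size, and η·t value below are illustrative assumptions of this sketch, not the paper's architecture or settings:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 16, 64                      # signal length, hidden width (assumed)

# Fixed random input and a two-layer ReLU surrogate network
# f(w) = W2 @ relu(W1 @ z)  -- a stand-in for the CNN denoiser
z = rng.standard_normal(n)
W1 = rng.standard_normal((m, n)) / np.sqrt(n)
W2 = rng.standard_normal((n, m)) / np.sqrt(m)

h = W1 @ z
a = np.maximum(h, 0.0)             # hidden activations
mask = (h > 0).astype(float)       # ReLU derivative

# Jacobian of the output w.r.t. all weights, in closed form:
#   d f_i / d W1[j,k] = W2[i,j] * mask[j] * z[k]
#   d f_i / d W2[i,j] = a[j]
J1 = np.kron(W2 * mask[None, :], z[None, :])   # shape (n, m*n)
J2 = np.kron(np.eye(n), a)                     # shape (n, n*m)
J = np.hstack([J1, J2])

# Empirical NTK at initialization: the random weights fully determine it
K = J @ J.T

# Linearized gradient-descent training on a noisy image y (squared loss,
# zero initial output) yields the closed-form filter
#   H(t) = I - exp(-eta * t * K),   y_t = H(t) @ y
eta_t = 0.05                       # learning rate x training time (assumed)
vals, vecs = np.linalg.eigh(K)
H = vecs @ np.diag(1.0 - np.exp(-eta_t * vals)) @ vecs.T

clean = np.sin(np.linspace(0, 2 * np.pi, n))
y = clean + 0.3 * rng.standard_normal(n)
denoised = H @ y                   # filtering without ever training the net
```

Because K depends only on the random initial weights, the filter H(t) is available before any training step; under the paper's analysis, Adam changes the hidden-layer weights enough during training that this fixed-kernel prediction no longer applies.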
