Structured Weight Priors for Convolutional Neural Networks

Selecting an architectural prior well suited to a task (e.g. convolutions for image data) is crucial to the success of deep neural networks (NNs). In contrast, the weight priors within these architectures are typically left vague, e.g. independent Gaussian distributions, which has fuelled debate over the utility of Bayesian deep learning. This paper explores the benefits of adding structure to weight priors. First, it considers the first-layer filters of a convolutional NN, designing a prior based on random Gabor filters. Second, it adds structure to the prior over final-layer weights by estimating how each hidden feature relates to each class. Empirical results suggest that these structured weight priors induce more meaningful functional priors for image data, contributing to the ongoing discussion on the importance of weight priors.
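
To make the first idea concrete, below is a minimal sketch of what a random Gabor filter prior over first-layer weights could look like: each filter is one draw of the standard 2D Gabor function with randomly sampled orientation, wavelength, phase, envelope width, and aspect ratio. The hyperprior ranges and the `sample_first_layer` helper are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def sample_gabor_filter(size, rng):
    """Draw one random Gabor filter; the hyperprior ranges below are illustrative guesses."""
    assert size % 2 == 1, "odd kernel sizes keep the grid centred"
    theta = rng.uniform(0.0, np.pi)        # orientation of the sinusoidal carrier
    lam   = rng.uniform(2.0, size)         # wavelength of the carrier
    psi   = rng.uniform(0.0, 2 * np.pi)    # phase offset
    sigma = rng.uniform(1.0, size / 2.0)   # width of the Gaussian envelope
    gamma = rng.uniform(0.5, 1.0)          # spatial aspect ratio

    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_r = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates by theta
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r**2 + (gamma * y_r)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_r / lam + psi)
    return envelope * carrier

def sample_first_layer(n_filters, size=7, seed=0):
    """One draw from the structured prior over a bank of first-layer filters."""
    rng = np.random.default_rng(seed)
    return np.stack([sample_gabor_filter(size, rng) for _ in range(n_filters)])

filters = sample_first_layer(n_filters=32, size=7)  # shape (32, 7, 7)
```

Sampling the Gabor parameters, rather than fixing them, keeps this a genuine prior: every forward draw yields a different bank of oriented, band-pass filters instead of unstructured Gaussian noise.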
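
The abstract leaves the final-layer construction unspecified; the following is one hypothetical realization of "estimating how each hidden feature relates to each class": centre the Gaussian prior for each final-layer weight on a simple per-class mean-activation statistic computed from training features. The function name, the chosen statistic, and the `scale` parameter are all assumptions for illustration, not the paper's method.

```python
import numpy as np

def final_layer_prior_mean(features, labels, n_classes, scale=1.0):
    """Hypothetical sketch: centre the prior for weight w[k, c] on how strongly
    hidden feature k co-occurs with class c (here, a mean-activation gap).
    features: (N, K) penultimate-layer activations; labels: (N,) class indices."""
    features = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)
    mu = np.zeros((features.shape[1], n_classes))
    for c in range(n_classes):
        in_class = features[labels == c].mean(axis=0)  # mean activation within class c
        overall = features.mean(axis=0)                # ~0 after standardisation
        mu[:, c] = scale * (in_class - overall)
    # A draw from the structured prior is then w = mu + sigma * eps, eps ~ N(0, I),
    # replacing the usual zero-mean Gaussian on the final layer.
    return mu
```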
