DBSN: Measuring Uncertainty through Bayesian Learning of Deep Neural Network Structures

Bayesian neural networks (BNNs) introduce uncertainty estimation to deep networks by performing Bayesian inference on network weights. However, such models bring the challenges of inference, and further BNNs with weight uncertainty rarely achieve superior performance to standard models. In this paper, we investigate a new line of Bayesian deep learning by performing Bayesian reasoning on the structure of deep neural networks. Drawing inspiration from the neural architecture search, we define the network structure as gating weights on the redundant operations between computational nodes, and apply stochastic variational inference techniques to learn the structure distributions of networks. Empirically, the proposed method substantially surpasses the advanced deep neural networks across a range of classification and segmentation tasks. More importantly, our approach also preserves benefits of Bayesian principles, producing improved uncertainty estimation than the strong baselines including MC dropout and variational BNNs algorithms (e.g. noisy EK-FAC).

[1]  Ning Chen,et al.  Message Passing Stein Variational Gradient Descent , 2017, ICML.

[2]  Thomas L. Griffiths,et al.  A Non-Parametric Bayesian Method for Inferring Hidden Causes , 2006, UAI.

[3]  Liang Lin,et al.  SNAS: Stochastic Neural Architecture Search , 2018, ICLR.

[4]  Masashi Sugiyama,et al.  Bayesian Dark Knowledge , 2015 .

[5]  Vijay Vasudevan,et al.  Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[6]  Roberto Cipolla,et al.  Segmentation and Recognition Using Structure from Motion Point Clouds , 2008, ECCV.

[7]  Quoc V. Le,et al.  Neural Architecture Search with Reinforcement Learning , 2016, ICLR.

[8]  Zoubin Ghahramani,et al.  Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.

[9]  Gregory Shakhnarovich,et al.  FractalNet: Ultra-Deep Neural Networks without Residuals , 2016, ICLR.

[10]  David J. C. MacKay,et al.  A Practical Bayesian Framework for Backpropagation Networks , 1992, Neural Computation.

[11]  Guodong Zhang,et al.  Eigenvalue Corrected Noisy Natural Gradient , 2018, ArXiv.

[12]  Vivek Rathod,et al.  Bayesian dark knowledge , 2015, NIPS.

[13]  Song Han,et al.  ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware , 2018, ICLR.

[14]  Finale Doshi-Velez,et al.  Structured Variational Learning of Bayesian Neural Networks with Horseshoe Priors , 2018, ICML.

[15]  Charles Blundell,et al.  Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles , 2016, NIPS.

[16]  Dilin Wang,et al.  Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm , 2016, NIPS.

[17]  Dit-Yan Yeung,et al.  Towards Bayesian Deep Learning: A Framework and Some Existing Methods , 2016, IEEE Transactions on Knowledge and Data Engineering.

[18]  Siegfried Wahl,et al.  Leveraging uncertainty information from deep neural networks for disease detection , 2016, Scientific Reports.

[19]  Finale Doshi-Velez,et al.  Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks , 2016, ICLR.

[20]  Julien Cornebise,et al.  Weight Uncertainty in Neural Network , 2015, ICML.

[21]  Pat Langley,et al.  Crafting Papers on Machine Learning , 2000, ICML.

[22]  Alex Kendall,et al.  Concrete Dropout , 2017, NIPS.

[23]  Max Welling,et al.  Multiplicative Normalizing Flows for Variational Bayesian Neural Networks , 2017, ICML.

[24]  Bo Zhang,et al.  Function Space Particle Optimization for Bayesian Neural Networks , 2019, ICLR.

[25]  Le Song,et al.  Meta Architecture Search , 2019, NeurIPS.

[26]  David Barber,et al.  Online Structured Laplace Approximations For Overcoming Catastrophic Forgetting , 2018, NeurIPS.

[27]  Kilian Q. Weinberger,et al.  On Calibration of Modern Neural Networks , 2017, ICML.

[28]  David M. Blei,et al.  Stochastic Gradient Descent as Approximate Bayesian Inference , 2017, J. Mach. Learn. Res..

[29]  Quoc V. Le,et al.  Efficient Neural Architecture Search via Parameter Sharing , 2018, ICML.

[30]  Guodong Zhang,et al.  Noisy Natural Gradient as Variational Inference , 2017, ICML.

[31]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Jian Sun,et al.  Identity Mappings in Deep Residual Networks , 2016, ECCV.

[33]  Geoffrey E. Hinton,et al.  Bayesian Learning for Neural Networks , 1995 .

[34]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[35]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[36]  Justin Bayer,et al.  Bayesian Learning of Neural Network Architectures , 2019, AISTATS.

[37]  Kevin Smith,et al.  Bayesian Uncertainty Estimation for Batch Normalized Deep Networks , 2018, ICML.

[38]  Alex Kendall,et al.  What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? , 2017, NIPS.

[39]  Yiming Yang,et al.  DARTS: Differentiable Architecture Search , 2018, ICLR.

[40]  Ryan P. Adams,et al.  Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks , 2015, ICML.

[41]  Sebastian Nowozin,et al.  How Good is the Bayes Posterior in Deep Neural Networks Really? , 2020, ICML.

[42]  Ben Glocker,et al.  Implicit Weight Uncertainty in Neural Networks. , 2017 .

[43]  Roger B. Grosse,et al.  Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions , 2019, ICLR.

[44]  Didrik Nielsen,et al.  Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam , 2018, ICML.

[45]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[46]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[47]  Samy Bengio,et al.  Adversarial examples in the physical world , 2016, ICLR.

[48]  Lawrence Carin,et al.  Learning Structured Weight Uncertainty in Bayesian Neural Networks , 2017, AISTATS.

[49]  Pieter Abbeel,et al.  Max-margin Classification of Data with Absent Features , 2008, J. Mach. Learn. Res..

[50]  Yujie Li,et al.  NAS-Unet: Neural Architecture Search for Medical Image Segmentation , 2019, IEEE Access.

[51]  Jun Zhu,et al.  Kernel Implicit Variational Inference , 2017, ICLR.

[52]  Mohammad Emtiyaz Khan,et al.  Practical Deep Learning with Bayesian Principles , 2019, NeurIPS.

[53]  Thomas Brox,et al.  Understanding and Robustifying Differentiable Architecture Search , 2020, ICLR.

[54]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[55]  Kirthevasan Kandasamy,et al.  Neural Architecture Search with Bayesian Optimisation and Optimal Transport , 2018, NeurIPS.

[56]  Guodong Zhang,et al.  Functional Variational Bayesian Neural Networks , 2019, ICLR.

[57]  Yoshua Bengio,et al.  The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[58]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[59]  Roberto Cipolla,et al.  SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling , 2015, CVPR 2015.

[60]  Tim Pearce,et al.  Expressive Priors in Bayesian Neural Networks: Kernel Combinations and Periodic Functions , 2019, UAI.

[61]  Yuandong Tian,et al.  FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[62]  Alex Graves,et al.  Practical Variational Inference for Neural Networks , 2011, NIPS.

[63]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[64]  Roberto Cipolla,et al.  Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding , 2015, BMVC.

[65]  Geoffrey E. Hinton,et al.  Keeping the neural networks simple by minimizing the description length of the weights , 1993, COLT '93.

[66]  Alok Aggarwal,et al.  Regularized Evolution for Image Classifier Architecture Search , 2018, AAAI.

[67]  Yee Whye Teh,et al.  The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.

[68]  Xiangyu Zhang,et al.  Single Path One-Shot Neural Architecture Search with Uniform Sampling , 2019, ECCV.

[69]  Matthew J. Beal,et al.  Variational Bayesian learning of directed graphical models with hidden variables , 2006 .

[70]  Balaji Lakshminarayanan,et al.  Deep Ensembles: A Loss Landscape Perspective , 2019, ArXiv.

[71]  Kilian Q. Weinberger,et al.  Deep Networks with Stochastic Depth , 2016, ECCV.

[72]  Quoc V. Le,et al.  Understanding and Simplifying One-Shot Architecture Search , 2018, ICML.

[73]  Ariel D. Procaccia,et al.  Variational Dropout and the Local Reparameterization Trick , 2015, NIPS.