Uncertainty in Neural Networks: Bayesian Ensembling

Understanding the uncertainty of a neural network's (NN) predictions is essential for many applications. The Bayesian framework provides a principled approach to this; however, applying it to NNs is challenging due to their large number of parameters and the scale of modern datasets. Ensembling NNs provides an easily implementable, scalable method for uncertainty quantification, but it has been criticised for not being Bayesian. In this work we propose one modification to the usual ensembling process that does result in Bayesian behaviour: regularising parameters about values drawn from a prior distribution. We provide theoretical support for this procedure as well as empirical evaluations on regression, image classification, and reinforcement learning problems.
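To make the proposed modification concrete, the following is a minimal sketch of "anchored" ensembling as described above: each ensemble member draws its own anchor values from the prior and is regularised towards those anchors rather than towards zero. The network architecture, hyperparameters (prior_std, data_noise, ensemble size), and the toy regression data are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn

def make_net():
    # Small fully connected regression network (illustrative architecture).
    return nn.Sequential(nn.Linear(1, 50), nn.ReLU(), nn.Linear(50, 1))

def train_anchored_member(x, y, prior_std=1.0, data_noise=0.1, epochs=1000):
    net = make_net()
    # Draw anchor values from the prior; parameters are regularised towards
    # these anchors instead of towards zero as in standard weight decay.
    anchors = [prior_std * torch.randn_like(p) for p in net.parameters()]
    lam = data_noise**2 / prior_std**2  # regularisation strength (assumed scaling)
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        mse = ((net(x) - y) ** 2).mean()
        anchor_term = sum(((p - a) ** 2).sum()
                          for p, a in zip(net.parameters(), anchors))
        loss = mse + lam * anchor_term / x.shape[0]
        loss.backward()
        opt.step()
    return net

# Train several members, each with its own anchor draw; the spread of their
# predictions is used as an estimate of predictive uncertainty.
x = torch.linspace(-1, 1, 40).unsqueeze(1)
y = torch.sin(3 * x) + 0.1 * torch.randn_like(x)
ensemble = [train_anchored_member(x, y) for _ in range(5)]
with torch.no_grad():
    preds = torch.stack([m(x) for m in ensemble])
mean, std = preds.mean(0), preds.std(0)
```

In this sketch the regularisation strength is set from the ratio of assumed data-noise variance to prior variance; in practice this choice controls how strongly each member is pulled back to its anchor and hence how much the ensemble spreads in regions with little data.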
