On the Practicality of Deterministic Epistemic Uncertainty

A set of novel approaches for estimating epistemic uncertainty in deep neural networks with a single forward pass has recently emerged as a valid alternative to Bayesian Neural Networks. On the premise of informative representations, these deterministic uncertainty methods (DUMs) achieve strong performance on detecting out-of-distribution (OOD) data while adding negligible computational costs at inference time. However, it remains unclear whether DUMs are well calibrated and can seamlessly scale to real-world applications, both of which are prerequisites for their practical deployment. To this end, we first provide a taxonomy of DUMs and evaluate their calibration under continuous distributional shifts as well as their OOD detection performance on image classification tasks. We then extend the most promising approaches to semantic segmentation. We find that, while DUMs scale to realistic vision tasks and perform well on OOD detection, the practicality of current methods is undermined by poor calibration under realistic distributional shifts.
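To make the calibration criterion concrete, the following is a minimal sketch (not taken from the paper) of the Expected Calibration Error (ECE), a standard binning-based measure of how well predicted confidences match empirical accuracy. The function name `expected_calibration_error` and the usage variables (`conf_clean`, `conf_shift`, etc.) are hypothetical.

```python
# Minimal ECE sketch: bin predictions by confidence and average the
# |accuracy - confidence| gap per bin, weighted by the fraction of samples in the bin.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """confidences: max softmax probability per sample; correct: boolean array."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by bin occupancy
    return ece

# Hypothetical usage: compare calibration on in-distribution vs. shifted data.
# ece_clean = expected_calibration_error(conf_clean, pred_clean == labels_clean)
# ece_shift = expected_calibration_error(conf_shift, pred_shift == labels_shift)
```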
