Simple and Scalable Epistemic Uncertainty Estimation Using a Single Deep Deterministic Neural Network

We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass. Our approach, deterministic uncertainty quantification (DUQ), builds upon ideas from RBF networks. We scale training of these networks with a novel loss function and centroid updating scheme. By enforcing detectability of changes in the input using a gradient penalty, we are able to reliably detect out-of-distribution data. Our uncertainty quantification scales well to large datasets, and using a single model, we improve upon or match Deep Ensembles on notably difficult dataset pairs such as FashionMNIST vs. MNIST and CIFAR-10 vs. SVHN, while maintaining competitive accuracy.
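
To make the RBF idea concrete, the following is a minimal NumPy sketch of how a single forward pass could yield both a prediction and a rejection decision. It is an illustration under stated assumptions, not the paper's implementation: the feature vectors stand in for the output of a trained deep model, and the function names, the mean-squared-distance kernel, the `length_scale` value, and the rejection `threshold` are all hypothetical choices made for this example. The loss function, the centroid updating scheme, and the gradient penalty are not shown.

```python
import numpy as np


def rbf_scores(features, centroids, length_scale=0.5):
    """Return one RBF kernel value in (0, 1] per (input, class) pair.

    features:  (batch, dim) array, assumed to come from a trained feature extractor.
    centroids: (classes, dim) array, one assumed centroid per class.
    """
    # Pairwise differences between every feature vector and every centroid.
    diff = features[:, None, :] - centroids[None, :, :]
    # Mean squared distance over the feature dimension (an assumed normalisation).
    sq_dist = np.mean(diff ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * length_scale ** 2))


def predict_or_reject(features, centroids, threshold=0.5, length_scale=0.5):
    """Predict the class of the closest centroid, or -1 if no centroid is close."""
    scores = rbf_scores(features, centroids, length_scale)
    labels = scores.argmax(axis=1)
    certainty = scores.max(axis=1)
    # Inputs far from every centroid get a low kernel value and are rejected.
    labels = np.where(certainty >= threshold, labels, -1)
    return labels, certainty


# Toy data: two class centroids and three inputs, the last far from both.
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])
features = np.array([[0.1, -0.1], [0.9, 1.0], [5.0, 5.0]])
labels, certainty = predict_or_reject(features, centroids)
```

In this toy setup the first two inputs lie near a centroid and keep their class labels, while the third is far from both centroids and is marked `-1`. The threshold would in practice be chosen on validation data.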
