Posterior Network: Uncertainty Estimation without OOD Samples via Density-Based Pseudo-Counts

Accurate estimation of aleatoric and epistemic uncertainty is crucial to build safe and reliable systems. Traditional approaches, such as dropout and ensemble methods, estimate uncertainty by sampling probability predictions from different submodels, which leads to slow uncertainty estimation at inference time. Recent works address this drawback by directly predicting parameters of prior distributions over the probability predictions with a neural network. While this approach has demonstrated accurate uncertainty estimation, it requires defining arbitrary target parameters for in-distribution data and makes the unrealistic assumption that out-of-distribution (OOD) data is known at training time. In this work we propose the Posterior Network (PostNet), which uses Normalizing Flows to predict an individual closed-form posterior distribution over predicted probabilites for any input sample. The posterior distributions learned by PostNet accurately reflect uncertainty for in- and out-of-distribution data -- without requiring access to OOD data at training time. PostNet achieves state-of-the art results in OOD detection and in uncertainty calibration under dataset shifts.

[1]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[2]  Mohammad Norouzi,et al.  Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One , 2019, ICLR.

[3]  Yee Whye Teh,et al.  Do Deep Generative Models Know What They Don't Know? , 2018, ICLR.

[4]  Eric Jang,et al.  Generative Ensembles for Robust Anomaly Detection , 2018, ArXiv.

[5]  Andrey Malinin,et al.  Ensemble Distribution Distillation , 2019, ICLR.

[6]  Alex Lamb,et al.  Deep Learning for Classical Japanese Literature , 2018, ArXiv.

[7]  Alexandre Lacoste,et al.  Neural Autoregressive Flows , 2018, ICML.

[8]  Mark J. F. Gales,et al.  Predictive Uncertainty Estimation via Prior Networks , 2018, NeurIPS.

[9]  Thomas G. Dietterich,et al.  Benchmarking Neural Network Robustness to Common Corruptions and Perturbations , 2018, ICLR.

[10]  Will Grathwohl Scalable Reversible Generative Models with Free-form Continuous Dynamics , 2018 .

[11]  Prafulla Dhariwal,et al.  Glow: Generative Flow with Invertible 1x1 Convolutions , 2018, NeurIPS.

[12]  Mohammad Emtiyaz Khan,et al.  Practical Deep Learning with Bayesian Principles , 2019, NeurIPS.

[13]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[14]  Max Welling,et al.  Improved Variational Inference with Inverse Autoregressive Flow , 2016, NIPS 2016.

[15]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[16]  Andrew Gordon Wilson,et al.  A Simple Baseline for Bayesian Uncertainty in Deep Learning , 2019, NeurIPS.

[17]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[18]  Alex Graves,et al.  Conditional Image Generation with PixelCNN Decoders , 2016, NIPS.

[19]  Shakir Mohamed,et al.  Variational Inference with Normalizing Flows , 2015, ICML.

[20]  Eric T. Nalisnick,et al.  Detecting Out-of-Distribution Inputs to Deep Generative Models Using Typicality , 2019 .

[21]  John Shawe-Taylor,et al.  A PAC analysis of a Bayesian estimator , 1997, COLT '97.

[22]  Murat Sensoy,et al.  Evidential Deep Learning to Quantify Classification Uncertainty , 2018, NeurIPS.

[23]  Andrew Y. Ng,et al.  Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .

[24]  Andrey Malinin,et al.  Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness , 2019, NeurIPS.

[25]  Christos Faloutsos,et al.  The Power of Certainty: A Dirichlet-Multinomial Model for Belief Propagation , 2017, SDM.

[26]  Sebastian Nowozin,et al.  Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift , 2019, NeurIPS.

[27]  Andrew Gordon Wilson,et al.  Why Normalizing Flows Fail to Detect Out-of-Distribution Data , 2020, NeurIPS.

[28]  Roland Vollgraf,et al.  Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms , 2017, ArXiv.

[29]  Zoubin Ghahramani,et al.  Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.

[30]  Julien Cornebise,et al.  Weight Uncertainty in Neural Networks , 2015, ArXiv.

[31]  Charles Blundell,et al.  Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles , 2016, NIPS.

[32]  Iain Murray,et al.  Masked Autoregressive Flow for Density Estimation , 2017, NIPS.

[33]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Stephan Günnemann,et al.  Uncertainty on Asynchronous Time Event Prediction , 2019, NeurIPS.

[35]  Julien Cornebise,et al.  Weight Uncertainty in Neural Network , 2015, ICML.

[36]  Pier Giovanni Bissiri,et al.  A general framework for updating belief distributions , 2013, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[37]  Kilian Q. Weinberger,et al.  On Calibration of Modern Neural Networks , 2017, ICML.