Simple and Accurate Uncertainty Quantification from Bias-Variance Decomposition

Accurate uncertainty quantification is crucial in applications where decisions carry real consequences, such as medical diagnosis and self-driving vehicles. We propose a new method based directly on the bias-variance decomposition: the parameter (epistemic) uncertainty is given by the variance of an ensemble divided by the number of its members, and the aleatoric uncertainty plus the squared bias is estimated by training a separate model regressed directly on the errors of the predictor. We demonstrate that this simple sequential procedure yields substantially more accurate uncertainty estimates than the current state of the art on two MRI reconstruction tasks.
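
The method rests on the classical bias-variance decomposition of the expected squared error of an ensemble-mean predictor $\bar f$:

$$\mathbb{E}\big[(y - \bar f(x))^2\big] = \underbrace{\big(\mathbb{E}[\bar f(x)] - f(x)\big)^2}_{\text{squared bias}} + \underbrace{\operatorname{Var}\big[\bar f(x)\big]}_{\text{parameter uncertainty}} + \underbrace{\sigma^2(x)}_{\text{aleatoric noise}}.$$

Because $\bar f$ averages $M$ approximately independent members, the variance of the ensemble mean is the member variance divided by $M$, which gives the first estimated term; a second model then accounts for the remaining squared bias plus aleatoric noise by regressing on the predictor's errors. Below is a minimal sketch of this sequential procedure. The scikit-learn decision trees (standing in for the paper's deep reconstruction networks), the use of squared residuals on held-out data as the regression target, and all function names are illustrative assumptions, not the authors' implementation.

import numpy as np
from sklearn.tree import DecisionTreeRegressor  # illustrative base learner, not the paper's

def train_ensemble(X, y, n_members=10, seed=0):
    """Step 1: train M bootstrap-resampled ensemble members."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X), len(X))             # bootstrap resample
        member = DecisionTreeRegressor(random_state=int(rng.integers(1 << 31)))
        member.fit(X[idx], y[idx])
        members.append(member)
    return members

def train_error_model(members, X_held_out, y_held_out):
    """Step 2 (sequential): regress a separate model on the squared errors of the
    ensemble mean; held-out data keeps the error targets from being optimistic."""
    preds = np.stack([m.predict(X_held_out) for m in members])
    squared_errors = (y_held_out - preds.mean(axis=0)) ** 2
    error_model = DecisionTreeRegressor(max_depth=5)
    error_model.fit(X_held_out, squared_errors)           # learns bias^2 + aleatoric term
    return error_model

def predict_with_uncertainty(members, error_model, X):
    """Total uncertainty = Var(members)/M + learned (bias^2 + aleatoric) term."""
    preds = np.stack([m.predict(X) for m in members])     # shape (M, N)
    mean = preds.mean(axis=0)
    parameter_var = preds.var(axis=0) / len(members)      # variance of the ensemble mean
    total_var = parameter_var + error_model.predict(X)
    return mean, total_var

# Usage sketch on synthetic data:
# X = np.random.randn(200, 4); y = X[:, 0] + 0.1 * np.random.randn(200)
# members = train_ensemble(X[:150], y[:150])
# error_model = train_error_model(members, X[150:], y[150:])
# mean, total_var = predict_with_uncertainty(members, error_model, X)

Whether the second model targets raw or squared errors, and whether it is fit on held-out or training data, are choices the abstract leaves open; the sketch uses squared residuals on a held-out split so that the learned term matches the units of a variance.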
