Uncertainty Characteristics Curves: A Systematic Assessment of Prediction Intervals

Accurate quantification of model uncertainty has long been recognized as a fundamental requirement for trusted AI. In regression tasks, uncertainty is typically quantified using prediction intervals calibrated to a specific operating point, making evaluation and comparison across different studies difficult. Our work leverages (1) the concept of operating characteristics curves and (2) the notion of gain over a simple reference to derive a novel, operating-point-agnostic assessment methodology for prediction intervals. The paper describes the corresponding algorithm, provides a theoretical analysis, and demonstrates its utility in multiple scenarios. We argue that the proposed method addresses the current need for comprehensive assessment of prediction intervals and thus represents a valuable addition to the uncertainty quantification toolbox.
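To make the operating-point sweep concrete, below is a minimal sketch in Python of how such a characteristics curve might be traced for symmetric intervals of the form prediction ± scale × predicted uncertainty. The function names, the choice of mean bandwidth and miss rate as the two axes, the scale grid, and the area-under-curve gain formula are all illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def uncertainty_characteristics_curve(y_true, y_pred, y_std, scales=None):
    """Trace a (cost, error) curve by sweeping the interval operating point.

    Each scale factor s defines symmetric intervals y_pred +/- s * y_std;
    the resulting mean bandwidth (cost) and miss rate (error) give one
    point on the curve. Axis choices here are illustrative assumptions.
    """
    if scales is None:
        scales = np.linspace(0.0, 5.0, 101)
    points = []
    for s in scales:
        half_width = s * y_std                     # scaled interval half-width
        missed = np.abs(y_true - y_pred) > half_width
        points.append((np.mean(2.0 * half_width),  # cost axis: mean bandwidth
                       np.mean(missed)))           # error axis: miss rate
    return np.asarray(points)

def gain_over_constant_reference(y_true, y_pred, y_std):
    """Gain over a simple reference: constant-width intervals of equal
    average width (i.e., uninformative per-sample uncertainty), compared
    via area under each curve. Hypothetical formula for illustration."""
    model = uncertainty_characteristics_curve(y_true, y_pred, y_std)
    ref = uncertainty_characteristics_curve(
        y_true, y_pred, np.full_like(y_std, y_std.mean()))
    area = lambda c: np.trapz(c[:, 1], c[:, 0])    # integrate error over cost
    return (area(ref) - area(model)) / area(ref)   # positive: beats reference
```

Under these assumptions, a positive gain indicates that the model's per-sample uncertainties order the errors better than a flat band of the same average width; the paper's own cost/error definitions and gain computation may differ.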

[1]  Jinbo Bi,et al.  Regression Error Characteristic Curves , 2003, ICML.

[2]  A. Kiureghian,et al.  Aleatory or epistemic? Does it matter? , 2009 .

[3]  A. Weigend,et al.  Estimating the mean and variance of the target probability distribution , 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).

[4]  H. Robinson Principles and Procedures of Statistics , 1961 .

[5]  M. Dwass Modified Randomization Tests for Nonparametric Hypotheses , 1957 .

[6]  David Lopez-Paz,et al.  Single-Model Uncertainties for Deep Learning , 2018, NeurIPS.

[7]  Bianca Zadrozny,et al.  Transforming classifier scores into accurate multiclass probability estimates , 2002, KDD.

[8]  Paulo Cortez,et al.  Modeling wine preferences by data mining from physicochemical properties , 2009, Decis. Support Syst..

[9]  Eric Xing,et al.  Methods for comparing uncertainty quantifications for material property predictions. , 2019 .

[10]  A. Raftery,et al.  Probabilistic forecasts, calibration and sharpness , 2007 .

[11]  Maya R. Gupta,et al.  To Trust Or Not To Trust A Classifier , 2018, NeurIPS.

[12]  I. R. Dunsmore,et al.  A Bayesian Approach to Calibration , 1968 .

[13]  Charles Blundell,et al.  Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles , 2016, NIPS.

[14]  Peder A. Olsen,et al.  Crowd Counting with Decomposed Uncertainty , 2019, AAAI.

[15]  H. Saunders,et al.  Probability, Random Variables and Stochastic Processes (2nd Edition) , 1989 .

[16]  Soumya Ghosh,et al.  Quality of Uncertainty Quantification for Bayesian Neural Network Inference , 2019, ArXiv.

[17]  Alex Kendall,et al.  What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? , 2017, NIPS.

[18]  Robert L. Wolpert,et al.  Statistical Inference , 2019, Encyclopedia of Social Network Analysis and Mining.

[19]  Stefano Ermon,et al.  Accurate Uncertainties for Deep Learning Using Calibrated Regression , 2018, ICML.

[20]  Rich Caruana,et al.  Predicting good probabilities with supervised learning , 2005, ICML.

[21]  Harikrishna Narasimhan,et al.  A Structural SVM Based Approach for Optimizing Partial AUC , 2013, ICML.

[22]  Tom Diethe,et al.  Distribution Calibration for Regression , 2019, ICML.

[23]  Richard E. Turner,et al.  On the Expressiveness of Approximate Inference in Bayesian Neural Networks , 2019, NeurIPS.

[24]  Tanmoy Bhattacharya,et al.  The need for uncertainty quantification in machine-assisted medical decision making , 2019, Nat. Mach. Intell..

[25]  Karthikeyan Shanmugam,et al.  Confidence Scoring Using Whitebox Meta-models with Linear Classifier Probes , 2018, AISTATS.

[26]  Tom Fawcett,et al.  An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[27]  R. Koenker,et al.  Regression Quantiles , 2007 .

[28]  Kush R. Varshney,et al.  Increasing Trust in AI Services through Supplier's Declarations of Conformity , 2018, IBM J. Res. Dev..

[29]  Milos Hauskrecht,et al.  Obtaining Well Calibrated Probabilities Using Bayesian Binning , 2015, AAAI.

[30]  Kilian Q. Weinberger,et al.  On Calibration of Modern Neural Networks , 2017, ICML.

[31]  Jie Chen,et al.  Wind Power Forecasting Using Multi-Objective Evolutionary Algorithms for Wavelet Neural Network-Optimized Prediction Intervals , 2018 .

[32]  Sebastian Nowozin,et al.  Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift , 2019, NeurIPS.

[33]  Ariel D. Procaccia,et al.  Variational Dropout and the Local Reparameterization Trick , 2015, NIPS.

[34]  Bianca Zadrozny,et al.  Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers , 2001, ICML.

[35]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[36]  Zoubin Ghahramani,et al.  A Theoretically Grounded Application of Dropout in Recurrent Neural Networks , 2015, NIPS.