Distribution-Free, Risk-Controlling Prediction Sets

While improving prediction accuracy has been the focus of machine learning in recent years, this alone does not suffice for reliable decision-making. Deploying learning systems in consequential settings also requires calibrating and communicating the uncertainty of predictions. To convey instance-wise uncertainty for prediction tasks, we show how to generate, from a black-box predictor, set-valued predictions that control the expected loss on future test points at a user-specified level. Our approach provides explicit finite-sample guarantees for any dataset by using a holdout set to calibrate the size of the prediction sets. This framework enables simple, distribution-free, rigorous error control for many tasks, and we demonstrate it on five large-scale machine learning problems: (1) classification problems where some mistakes are more costly than others; (2) multi-label classification, where each observation has multiple associated labels; (3) classification problems where the labels have a hierarchical structure; (4) image segmentation, where we wish to predict a set of pixels containing an object of interest; and (5) protein structure prediction. Finally, we discuss extensions to uncertainty quantification for ranking, metric learning, and distributionally robust learning.
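The calibration step described above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a classifier that outputs class probabilities, a 0/1 miscoverage loss, sets of the form "all classes whose probability reaches a threshold", and a Hoeffding upper confidence bound with a hypothetical confidence parameter `delta`. The function names and the threshold grid are likewise illustrative assumptions.

```python
import numpy as np

def hoeffding_upper_bound(empirical_risk, n, delta):
    """Upper confidence bound on the true risk, valid for losses in [0, 1].

    Assumption: Hoeffding's inequality is used here only for illustration.
    """
    return empirical_risk + np.sqrt(np.log(1.0 / delta) / (2.0 * n))

def prediction_set(probs, threshold):
    """Indices of all classes whose probability reaches the threshold."""
    return np.flatnonzero(probs >= threshold)

def calibrate_threshold(cal_probs, cal_labels, alpha, delta, grid):
    """Pick the largest threshold whose bound stays below alpha.

    cal_probs:  (n, k) array of class probabilities on the holdout set.
    cal_labels: (n,) array of true class indices on the holdout set.
    Returns None if even the largest sets (smallest threshold) fail.
    """
    n = len(cal_labels)
    true_class_prob = cal_probs[np.arange(n), cal_labels]
    chosen = None
    # Ascending thresholds: sets shrink and the miscoverage risk grows.
    for t in np.sort(grid):
        # 0/1 loss: the true label is missed when its probability is below t.
        empirical_risk = np.mean(true_class_prob < t)
        if hoeffding_upper_bound(empirical_risk, n, delta) >= alpha:
            break  # stop at the first threshold that fails the bound
        chosen = t
    return chosen
```

Under these assumptions, `calibrate_threshold(cal_probs, cal_labels, alpha=0.1, delta=0.1, grid=np.linspace(0, 1, 101))` returns a threshold, and `prediction_set(test_probs, threshold)` then gives the set for a new point. Scanning from the largest sets toward smaller ones and stopping at the first failure is what lets a single holdout set certify the chosen threshold.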
