Individual Calibration with Randomized Forecasting

Machine learning applications often require calibrated predictions, e.g. a 90% credible interval should contain the true outcome 90% of the time. However, typical definitions of calibration only require this to hold on average and offer no guarantees for predictions on individual samples. As a result, predictions can be systematically over- or under-confident on certain subgroups, leading to issues of fairness and potential vulnerabilities. We show that calibration for individual samples is possible in the regression setting if the predictions are randomized, i.e. the model outputs randomized credible intervals. Randomization removes systematic bias by trading bias for variance. We design a training objective to enforce individual calibration and use it to train randomized regression functions. The resulting models are better calibrated on arbitrarily chosen subgroups of the data, and can achieve higher utility in decision making against adversaries that exploit miscalibrated predictions.
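As an illustration only, the sketch below shows one plausible way to train a randomized regression forecaster: each training example is paired with a fresh random quantile level, and the network is trained with the pinball (quantile) loss to output that quantile, so credible intervals are themselves randomized. This is a minimal sketch under stated assumptions, not necessarily the paper's exact objective; the architecture, hyperparameters, and synthetic heteroscedastic data are illustrative choices.

```python
# Minimal sketch: randomized quantile regression with a per-sample random
# quantile level r ~ Uniform(0, 1). Assumed setup, not the paper's exact objective.
import torch
import torch.nn as nn

class RandomizedForecaster(nn.Module):
    """Maps (x, r) to an estimate of the r-th conditional quantile of y given x."""
    def __init__(self, x_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, r):
        # Concatenate the random quantile level to the input features.
        return self.net(torch.cat([x, r], dim=-1)).squeeze(-1)

def pinball_loss(y_pred, y_true, r):
    """Quantile (pinball) loss at level r; minimized when y_pred is the r-quantile."""
    r = r.squeeze(-1)
    diff = y_true - y_pred
    return torch.mean(torch.maximum(r * diff, (r - 1) * diff))

# Illustrative synthetic data: y = 2*x_0 + heteroscedastic noise.
torch.manual_seed(0)
x = torch.randn(1024, 3)
y = 2 * x[:, 0] + (0.5 + x[:, 1].abs()) * torch.randn(1024)

model = RandomizedForecaster(x_dim=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(500):
    r = torch.rand(x.shape[0], 1)           # fresh random quantile level per sample
    loss = pinball_loss(model(x, r), y, r)  # randomization enters through r
    opt.zero_grad()
    loss.backward()
    opt.step()

# A 90% credible interval for a new input can be read off from the 5th and
# 95th predicted quantiles (or from randomly drawn lower/upper levels).
x_new = torch.randn(1, 3)
lower = model(x_new, torch.tensor([[0.05]]))
upper = model(x_new, torch.tensor([[0.95]]))
```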
