Calibration with Bias-Corrected Temperature Scaling Improves Domain Adaptation Under Label Shift in Modern Neural Networks

Label shift refers to the phenomenon where the marginal probability p(y) of observing a particular class changes between the training and test distributions while the conditional probability p(x|y) stays fixed. This is relevant in settings such as medical diagnosis, where a classifier trained to predict disease from observed symptoms may need to be adapted to a population in which the baseline frequency of the disease is higher. Given calibrated estimates of p(y|x), one can apply an EM algorithm to correct for the shift in class proportions between the training and test distributions without ever needing to estimate p(x|y). Unfortunately, modern neural networks typically fail to produce well-calibrated probabilities, compromising the effectiveness of this approach. Although Temperature Scaling can greatly reduce miscalibration in these networks, it can leave behind a systematic bias in the probabilities that still poses a problem. To address this, we extend Temperature Scaling with class-specific bias parameters, which largely eliminates systematic bias in the calibrated probabilities and enables effective domain adaptation under label shift. We term our calibration approach "Bias-Corrected Temperature Scaling". In experiments on CIFAR10, we find that EM with Bias-Corrected Temperature Scaling significantly outperforms both EM with Temperature Scaling and the recently proposed Black-Box Shift Estimation.
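
For concreteness, the sketch below (illustrative code written for this summary, not taken from the paper; the function names and fixed parameters are assumptions) shows the two ingredients: calibrated probabilities obtained by applying a scalar temperature T and a class-specific bias vector b to the network's logits, and the EM procedure of Saerens et al. (2002) that re-estimates the test-set class priors from those calibrated probabilities. Fitting T and b on a held-out validation set (e.g., by minimizing negative log-likelihood) is omitted.

import numpy as np

def bias_corrected_temperature_scaling(logits, T, b):
    # Calibrated probabilities: softmax(logits / T + b), where T is a scalar
    # temperature and b is a vector of per-class bias parameters (both assumed
    # to have been fit on held-out validation data; that step is not shown).
    z = logits / T + b
    z -= z.max(axis=1, keepdims=True)  # subtract row-wise max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def em_label_shift(cal_probs, train_priors, n_iter=100, tol=1e-8):
    # Given calibrated source-domain posteriors p(y|x) for unlabeled test inputs
    # and the training-set class priors, alternate between reweighting the
    # posteriors (E-step) and re-estimating the test-set priors (M-step).
    q = train_priors.copy()
    for _ in range(n_iter):
        w = cal_probs * (q / train_priors)   # E-step: reweight by prior ratio
        w /= w.sum(axis=1, keepdims=True)    # renormalize per example
        q_new = w.mean(axis=0)               # M-step: average adapted posterior
        if np.abs(q_new - q).max() < tol:
            q = q_new
            break
        q = q_new
    adapted = cal_probs * (q / train_priors)  # final adapted posteriors
    adapted /= adapted.sum(axis=1, keepdims=True)
    return q, adapted

Under the label-shift assumption, the ratio q(y)/p_train(y) is exactly the importance weight needed to convert training-domain posteriors into test-domain posteriors, which is why any miscalibration in p(y|x) propagates directly into the prior estimates.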
