Post-hoc Calibration of Neural Networks

Calibration of neural networks is a critical consideration when incorporating machine learning models into real-world decision-making systems, where the confidence of a decision is as important as the decision itself. In recent years there has been a surge of research on neural network calibration, and the majority of the work falls under post-hoc calibration methods, defined as methods that learn an additional function to calibrate an already trained base network. In this work, we study post-hoc calibration methods from a theoretical point of view. In particular, it is known that minimizing the Negative Log-Likelihood (NLL) yields a network that is calibrated on the training set if the global optimum is attained (Bishop, 1994). Nevertheless, it is not clear whether learning an additional function in a post-hoc manner leads to calibration in this theoretical sense. To this end, we prove that even when the base network ($f$) does not attain the global optimum of the NLL, adding extra layers ($g$) and minimizing the NLL over the parameters of $g$ alone yields a calibrated network $g \circ f$. This not only provides a less stringent condition for obtaining a calibrated network but also gives a theoretical justification of post-hoc calibration methods. Our experiments on various image classification benchmarks confirm the theory.
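
To make the setting concrete, below is a minimal PyTorch-style sketch of post-hoc calibration by NLL minimization; it is an illustrative assumption rather than the paper's prescribed implementation. The base network f is kept fixed, a calibration map g (here a single-parameter temperature-scaling layer, chosen only for illustration) is placed on top of its logits, and only the parameters of g are trained with the cross-entropy (NLL) loss on a held-out calibration set. The names f, g, and cal_loader are hypothetical placeholders.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TemperatureScaling(nn.Module):
    # Illustrative calibration map g: rescales logits by a learned temperature T > 0.
    def __init__(self):
        super().__init__()
        self.log_t = nn.Parameter(torch.zeros(1))  # T = exp(log_t) stays positive

    def forward(self, logits):
        return logits / torch.exp(self.log_t)

def calibrate(f, g, cal_loader, epochs=10, lr=1e-2):
    # Minimize the NLL of g(f(x)) over the parameters of g only; f stays frozen.
    f.eval()
    opt = torch.optim.Adam(g.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in cal_loader:
            with torch.no_grad():
                logits = f(x)                         # no gradients flow into f
            loss = F.cross_entropy(g(logits), y)      # cross-entropy = NLL for hard labels
            opt.zero_grad()
            loss.backward()
            opt.step()
    return g

Any other calibration map, such as an intra order-preserving function [3] or Dirichlet calibration [10], could replace the temperature layer; the point that matches the theory is that the NLL is minimized over the parameters of g while f is left unchanged.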

[1] Cristian Sminchisescu, et al. Calibration of Neural Networks using Splines, 2020, arXiv.

[2] Peter A. Flach, et al. Beta calibration: a well-founded and easily implemented improvement on logistic calibration for binary classifiers, 2017, AISTATS.

[3] Byron Boots, et al. Intra Order-preserving Functions for Calibration of Multi-Class Neural Networks, 2020, NeurIPS.

[4] Christopher M. Bishop. Mixture Density Networks, 1994.

[5] John Platt. Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods, 1999.

[6] Hongyi Zhang, et al. mixup: Beyond Empirical Risk Minimization, 2017, ICLR.

[7] Bianca Zadrozny, et al. Transforming classifier scores into accurate multiclass probability estimates, 2002, KDD.

[8] A. N. Kolmogorov. Sulla determinazione empirica di una legge di distribuzione, 1933.

[9] Milos Hauskrecht, et al. Obtaining Well Calibrated Probabilities Using Bayesian Binning, 2015, AAAI.

[10] Peter A. Flach, et al. Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration, 2019, NeurIPS.

[11] Andrew Y. Ng, et al. Reading Digits in Natural Images with Unsupervised Feature Learning, 2011.

[12] Nikos Komodakis, et al. Wide Residual Networks, 2016, BMVC.

[13] A. Buja, et al. Loss Functions for Binary Class Probability Estimation and Classification: Structure and Applications, 2005.

[14] Geoffrey E. Hinton, et al. When Does Label Smoothing Help?, 2019, NeurIPS.

[15] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.

[16] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2017, CVPR.

[17] Tengyu Ma, et al. Verified Uncertainty Calibration, 2019, NeurIPS.

[18] Tianqi Chen, et al. Net2Net: Accelerating Learning via Knowledge Transfer, 2015, ICLR.

[19] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, CVPR.

[20] Mark D. Reid, et al. Composite Binary Losses, 2009, J. Mach. Learn. Res.

[21] Kilian Q. Weinberger, et al. Deep Networks with Stochastic Depth, 2016, ECCV.

[22] Kilian Q. Weinberger, et al. On Calibration of Modern Neural Networks, 2017, ICML.