Bin-wise Temperature Scaling (BTS): Improvement in Confidence Calibration Performance through Simple Scaling Techniques

The prediction reliability of neural networks is important in many applications. Specifically, in safety-critical domains, such as cancer prediction or autonomous driving, a reliable confidence of model’s prediction is critical for the interpretation of the results. Modern deep neural networks have achieved a significant improvement in performance for many different image classification tasks. However, these networks tend to be poorly calibrated in terms of output confidence. Temperature scaling is an efficient post-processing-based calibration scheme and obtains well calibrated results. In this study, we leverage the concept of temperature scaling to build a sophisticated bin-wise scaling. Furthermore, we adopt augmentation of validation samples for elaborated scaling. The proposed methods consistently improve calibration performance with various datasets and deep convolutional neural network models.

[1]  Ran El-Yaniv,et al.  Deep Anomaly Detection Using Geometric Transformations , 2018, NeurIPS.

[2]  Bohyung Han,et al.  Learning for Single-Shot Confidence Calibration in Deep Neural Networks Through Stochastic Inferences , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Kilian Q. Weinberger,et al.  On Calibration of Modern Neural Networks , 2017, ICML.

[4]  Geoffrey E. Hinton,et al.  Regularizing Neural Networks by Penalizing Confident Output Distributions , 2017, ICLR.

[5]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[6]  Sunita Sarawagi,et al.  Trainable Calibration Measures For Neural Networks From Kernel Mean Embeddings , 2018, ICML.

[7]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[10]  Jonathan Krause,et al.  3D Object Representations for Fine-Grained Categorization , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[11]  Stephen E. Fienberg,et al.  The Comparison and Evaluation of Forecasters. , 1983 .

[12]  Nikos Komodakis,et al.  Wide Residual Networks , 2016, BMVC.

[13]  Luc Van Gool,et al.  Seven Ways to Improve Example-Based Single Image Super Resolution , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Rich Caruana,et al.  Predicting good probabilities with supervised learning , 2005, ICML.

[15]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .

[16]  Milos Hauskrecht,et al.  Obtaining Well Calibrated Probabilities Using Bayesian Binning , 2015, AAAI.