Adaptive calibration for binary classification

This note proposes a way of making probability forecasting rules less sensitive to changes in the data distribution, concentrating on the simple case of binary classification. This matters in applications of machine learning, where the quality of a trained predictor may drop significantly during deployment. Our techniques are based on recent work on conformal test martingales and older work on prediction with expert advice, namely tracking the best expert. The version of this paper at http://alrw.net (Working Paper 35) is updated most often.
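As a rough illustration of the martingale-testing idea mentioned above, the sketch below implements a simple betting test martingale on a stream of p-values (the kind produced by a conformal predictor). This is not the procedure of the paper, just a minimal standard example: the betting function b(u) = 1 + epsilon*(u - 1/2) integrates to 1 on [0, 1], so under the null hypothesis of uniform p-values the running product is a nonnegative martingale, and large values witness miscalibration; the function name and the choice epsilon = -1 (betting on small p-values) are illustrative assumptions.

```python
import random

def betting_martingale(p_values, epsilon=-1.0):
    """Bet against the uniformity of a sequence of p-values.

    Under the null, each p-value is uniform on [0, 1], so
    b(u) = 1 + epsilon * (u - 1/2) has expectation 1 and the
    running product M_t is a nonnegative martingale with M_0 = 1.
    By Ville's inequality, P(sup_t M_t >= c) <= 1/c, so a large
    final value is evidence against the null.
    """
    m = 1.0
    path = []
    for p in p_values:
        m *= 1.0 + epsilon * (p - 0.5)
        path.append(m)
    return path

random.seed(0)
# Well-calibrated case: uniform p-values; the martingale tends to shrink.
uniform = [random.random() for _ in range(1000)]
# Miscalibrated case: p-values skewed toward 0; epsilon = -1 bets pay off.
skewed = [random.random() ** 2 for _ in range(1000)]

print(betting_martingale(uniform)[-1])
print(betting_martingale(skewed)[-1])
```

A fixed epsilon is a deliberately crude strategy; the cited work mixes over betting parameters (and, for adaptivity to distribution shift, allows the parameter to "jump" over time, in the spirit of tracking the best expert).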
