The Fundamental Nature of the Log Loss Function

The standard loss functions used in the literature on probabilistic prediction are the log loss function, the Brier loss function, and the spherical loss function; however, any computable proper loss function can be used to compare prediction algorithms. This note shows that the log loss function is the most selective, in the following sense: any prediction algorithm that is optimal for a given data sequence (in the sense of the algorithmic theory of randomness) under the log loss function is also optimal under any computable proper mixable loss function. On the other hand, there is a data sequence and a prediction algorithm that is optimal for that sequence under either of the two other standard loss functions but not under the log loss function.
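
As a rough illustration of the three standard loss functions named above, the following Python sketch writes them out for binary outcomes and checks propriety numerically on a grid. It is not taken from the note itself: the function names, the restriction to binary outcomes, the normalizations (conventions for the Brier and spherical losses differ across the literature), and the grid check are all assumptions made for this example.

```python
import math

# Assumption: binary outcomes y in {0, 1}; p is the predicted probability of y = 1.

def log_loss(p, y):
    """Log loss: minus the log of the probability assigned to the observed outcome."""
    return -math.log(p if y == 1 else 1.0 - p)

def brier_loss(p, y):
    """Brier loss: squared distance between the prediction and the outcome indicator,
    summed over both outcomes (one common normalization; others halve it)."""
    return (y - p) ** 2 + ((1 - y) - (1.0 - p)) ** 2

def spherical_loss(p, y):
    """Spherical loss: 1 minus the probability of the observed outcome divided by
    the Euclidean norm of the prediction vector (p, 1 - p)."""
    norm = math.sqrt(p * p + (1.0 - p) ** 2)
    return 1.0 - (p if y == 1 else 1.0 - p) / norm

def expected_loss(loss, p, q):
    """Expected loss of prediction p when the outcome is 1 with probability q."""
    return q * loss(p, 1) + (1.0 - q) * loss(p, 0)

def best_prediction(loss, q, grid_size=999):
    """Grid point in (0, 1) with the smallest expected loss under q (hypothetical helper)."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda p: expected_loss(loss, p, q))

if __name__ == "__main__":
    q = 0.3
    for loss in (log_loss, brier_loss, spherical_loss):
        # For a proper loss, the expected loss is minimized by predicting q itself.
        print(loss.__name__, best_prediction(loss, q))
```

Under these conventions all three losses are proper, so each grid search returns the true probability. One visible difference is that the log loss is unbounded as the probability assigned to the observed outcome tends to 0, whereas the Brier and spherical losses stay bounded; the sketch does not attempt to reproduce the note's optimality result.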
