Proper Scoring Rules and Bregman Divergences

We revisit the mathematical foundations of proper scoring rules (PSRs) and Bregman divergences and present their characteristic properties in a unified theoretical framework. In many situations it is preferable to generate a PSR not directly from its convex entropy on the unit simplex but from the sublinear extension of that entropy to the positive orthant. The scoring rule is then simply a subgradient of the extended entropy, which allows for a more elegant theory. The other convex extensions of the entropy generate affine extensions of the scoring rule and induce the class of functional Bregman divergences. We discuss the geometric nature of the relationship between PSRs and Bregman divergences, and we extend and unify existing partial results. We also address the differentiability of entropy functions. Not all entropies of interest possess functional derivatives, but they all have directional derivatives in almost every direction. Using the notion of the quasi-interior of a convex set to quantify the latter property, we formalise the conditions under which a PSR may be considered uniquely determined by its entropy.
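As a small finite-dimensional illustration of the subgradient construction, the sketch below takes the negative Shannon entropy on a three-outcome simplex, extends it 1-homogeneously to the positive orthant, and checks numerically that the gradient of the extension at a forecast p reproduces the logarithmic score, and that the associated divergence is the Kullback-Leibler divergence. The choice of entropy, the positive orientation of the score, the finite-difference gradient, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import math

def extended_entropy(x):
    """Assumed example: 1-homogeneous (sublinear) extension of the negative
    Shannon entropy sum_i p_i log p_i from the simplex to the positive orthant."""
    s = sum(x)
    return sum(xi * math.log(xi / s) for xi in x)

def gradient(f, x, h=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    g = []
    for i in range(len(x)):
        up = list(x)
        dn = list(x)
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g

def log_score(p):
    """Positively oriented logarithmic score: reward log p_i if outcome i occurs."""
    return [math.log(pi) for pi in p]

def expected_score(p, q):
    """Expected score of forecast p when outcomes are distributed as q."""
    return sum(qi * si for qi, si in zip(q, log_score(p)))

def divergence(p, q):
    """Expected loss from quoting p instead of the true distribution q."""
    return expected_score(q, q) - expected_score(p, q)

p = [0.2, 0.3, 0.5]   # forecast
q = [0.1, 0.6, 0.3]   # assumed true distribution

grad_at_p = gradient(extended_entropy, p)
kl_q_p = sum(qi * math.log(qi / pi) for qi, pi in zip(q, p))

print("gradient of extended entropy at p:", grad_at_p)
print("log score at p:                   ", log_score(p))
print("divergence:", divergence(p, q), "KL(q || p):", kl_q_p)
```

In this example the gradient of the extension needs no correction term to yield the score, whereas the gradient of the unextended entropy on the simplex, log p_i + 1, differs from the score by a constant. Non-negativity of the divergence expresses propriety: the true distribution q maximises the expected score.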
