Fair scores for ensemble forecasts

The notion of fair scores for ensemble forecasts was introduced recently to reward ensembles whose members behave as though they and the verifying observation were sampled from the same distribution. For binary outcomes, a characterization is given of a general class of fair scores for ensembles that are interpreted as random samples, and this characterization is used to construct classes of fair scores for ensembles forecasting multicategory and continuous outcomes. The usual Brier, ranked probability and continuous ranked probability scores for ensemble forecasts are shown to be unfair, while adjusted versions of these scores are shown to be fair. A definition of fairness is also proposed for ensembles whose members are interpreted as dependent, and it is shown that fair scores exist only for some forms of dependence.
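As a concrete illustration of the adjustment for the binary case (a minimal sketch, not the paper's code): if i of the m ensemble members forecast the event and y is the binary outcome, the usual ensemble Brier score (i/m − y)² is assumed here to be debiased by subtracting the finite-ensemble term i(m − i)/(m²(m − 1)); the function name below is our own.

```python
def fair_brier_score(members, obs):
    """Fair Brier score for an m-member binary ensemble forecast.

    members : list of 0/1 member forecasts, one per ensemble member
    obs     : observed binary outcome (0 or 1)
    """
    m = len(members)
    i = sum(members)                      # members forecasting the event
    unfair = (i / m - obs) ** 2           # usual ensemble Brier score
    # Subtract the finite-ensemble bias term; it vanishes as m grows,
    # so the fair and usual scores agree in the large-ensemble limit.
    return unfair - i * (m - i) / (m ** 2 * (m - 1))
```

Note that when all members agree (i = 0 or i = m) the adjustment is zero, so only ensembles with internal spread receive a correction.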