Probabilistic forecasts, calibration and sharpness

Summary. Probabilistic forecasts of continuous variables take the form of predictive densities or predictive cumulative distribution functions. We propose a diagnostic approach to the evaluation of predictive performance that is based on the paradigm of maximizing the sharpness of the predictive distributions subject to calibration. Calibration refers to the statistical consistency between the distributional forecasts and the observations and is a joint property of the predictions and the events that materialize. Sharpness refers to the concentration of the predictive distributions and is a property of the forecasts only. A simple theoretical framework allows us to distinguish between probabilistic calibration, exceedance calibration and marginal calibration. We propose and study tools for checking calibration and sharpness, among them the probability integral transform histogram, marginal calibration plots, the sharpness diagram and proper scoring rules. The diagnostic approach is illustrated by an assessment and ranking of probabilistic forecasts of wind speed at the Stateline wind energy centre in the US Pacific Northwest. In combination with cross‐validation or in the time series context, our proposal provides very general, nonparametric alternatives to the use of information criteria for model diagnostics and model selection.
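
As a rough illustration of the diagnostics named in the summary, the following Python sketch computes a probability integral transform (PIT) histogram, a simple sharpness measure and a proper scoring rule for two Gaussian forecasters. The simulation setup is an assumption made here for illustration and is not taken from the wind speed case study: observations are drawn as y_t ~ N(mu_t, 1) with mu_t ~ N(0, 1), an "ideal" forecaster issues N(mu_t, 1), and a "climatological" forecaster always issues N(0, 2). All function names are hypothetical, sharpness is summarized by the mean width of the central 90% prediction interval, and the score is the continuous ranked probability score (CRPS) in its closed form for normal distributions.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def pit_values(forecasts, observations):
    """PIT value F_t(y_t) for each forecast-observation pair."""
    return [f.cdf(y) for f, y in zip(forecasts, observations)]

def pit_histogram(pits, bins=10):
    """Relative frequency of PIT values in equal-width bins on [0, 1]."""
    counts = [0] * bins
    for p in pits:
        counts[min(int(p * bins), bins - 1)] += 1
    return [c / len(pits) for c in counts]

def mean_interval_width(forecasts, level=0.9):
    """Sharpness summary: average width of the central prediction interval."""
    lo, hi = (1 - level) / 2, (1 + level) / 2
    return sum(f.inv_cdf(hi) - f.inv_cdf(lo) for f in forecasts) / len(forecasts)

def crps_normal(f, y):
    """Closed-form CRPS of a normal predictive distribution (lower is better)."""
    z = (y - f.mean) / f.stdev
    return f.stdev * (
        z * (2 * STD_NORMAL.cdf(z) - 1)
        + 2 * STD_NORMAL.pdf(z)
        - 1 / math.sqrt(math.pi)
    )

# Assumed simulation setup (illustrative only, not the paper's data).
rng = random.Random(0)
n = 5000
mus = [rng.gauss(0, 1) for _ in range(n)]
obs = [rng.gauss(mu, 1) for mu in mus]

ideal = [NormalDist(mu, 1.0) for mu in mus]              # uses mu_t
climatological = [NormalDist(0.0, math.sqrt(2.0))] * n   # ignores mu_t

results = {}
for name, fc in [("ideal", ideal), ("climatological", climatological)]:
    hist = pit_histogram(pit_values(fc, obs))
    results[name] = {
        # A flat PIT histogram (all bins near 0.1) suggests probabilistic calibration.
        "max_bin_deviation": max(abs(h - 0.1) for h in hist),
        "mean_width_90": mean_interval_width(fc),
        "mean_crps": sum(crps_normal(f, y) for f, y in zip(fc, obs)) / n,
    }
    print(name, results[name])
```

Under these assumptions both forecasters should produce a roughly uniform PIT histogram, so the histogram alone does not separate them. The ideal forecaster issues narrower intervals and attains a lower mean CRPS, which is the sense in which sharpness is maximized subject to calibration.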
