Calibration tests for multivariate Gaussian forecasts

Forecasts should, by their nature, take the form of probability distributions. Calibration, the statistical consistency between forecast distributions and observations, is a central property of good probabilistic forecasts. Calibration of univariate forecasts has been widely discussed, and significance tests are commonly used to investigate whether a prediction model is miscalibrated. Calibration tests for multivariate forecasts, however, are rare. In this paper, we propose calibration tests for multivariate Gaussian forecasts based on two versions of the Dawid-Sebastiani score (DSS): the multivariate DSS (mDSS) and the individual DSS (iDSS). Analytic results and simulation studies show that the tests have sufficient power to detect miscalibrated forecasts with an incorrect mean or an incorrect variance. For forecasts with incorrect correlation coefficients, however, only the tests based on the mDSS are sensitive to miscalibration. As an illustration, we apply the methodology to weekly data on norovirus disease incidence among males and females in Germany from 2011 to 2014. The results further show that tests for multivariate forecasts are useful tools and are superior to univariate calibration tests for correlated multivariate forecasts.
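To make the idea of an mDSS-based test concrete, the following is a minimal sketch and not necessarily the exact procedure of the paper. It assumes the standard definition DSS(y) = log|Sigma| + (y - mu)^T Sigma^{-1} (y - mu), and it uses the fact that, for a calibrated d-dimensional Gaussian forecast, the quadratic form is chi-squared with d degrees of freedom, so the score has mean log|Sigma| + d and variance 2d. Treating the forecasts as independent across time points is an additional simplifying assumption, and the function names `mdss` and `mdss_z_test` are hypothetical.

```python
import math
import numpy as np

def mdss(y, mu, sigma):
    """Multivariate DSS: log|Sigma| + (y - mu)^T Sigma^{-1} (y - mu)."""
    resid = np.asarray(y, dtype=float) - np.asarray(mu, dtype=float)
    _, logdet = np.linalg.slogdet(sigma)
    return logdet + resid @ np.linalg.solve(sigma, resid)

def mdss_z_test(ys, mus, sigmas):
    """z-statistic and two-sided p-value for the null of calibration.

    Assumes Gaussian forecasts that are independent across time points,
    so the null variances of the individual scores (2d each) simply add up.
    """
    ys = np.asarray(ys, dtype=float)
    n, d = ys.shape
    scores = np.array([mdss(y, m, s) for y, m, s in zip(ys, mus, sigmas)])
    # Null expectation of each score: log|Sigma_t| + d
    expected = np.array([np.linalg.slogdet(s)[1] + d for s in sigmas])
    z = (scores.sum() - expected.sum()) / math.sqrt(2.0 * d * n)
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Usage on simulated data (illustrative values only): calibrated forecasts
rng = np.random.default_rng(1)
n = 200
cov = np.array([[1.0, 0.5], [0.5, 1.0]])
ys = rng.multivariate_normal(np.zeros(2), cov, size=n)
mus = np.zeros((n, 2))
sigmas = np.repeat(cov[None, :, :], n, axis=0)
z, p = mdss_z_test(ys, mus, sigmas)
print(f"z = {z:.3f}, p = {p:.3f}")
```

A small p-value would indicate evidence against calibration. Replacing `mus` or `sigmas` with misspecified values in the simulation is one way to explore the behaviour of the statistic under miscalibration.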

[1]  H. Hotelling The Generalization of Student’s Ratio , 1931 .

[2]  Michael Höhle,et al.  Estimating the under-reporting of norovirus illness in Germany utilizing enhanced awareness of diarrhoea during a large outbreak of Shiga toxin-producing E. coli O104:H4 in 2011 – a time series analysis , 2014, BMC Infectious Diseases.

[3]  Paola Sebastiani,et al.  Coherent dispersion criteria for optimal experimental design , 1999 .

[4]  L. Isserlis ON A FORMULA FOR THE PRODUCT-MOMENT COEFFICIENT OF ANY ORDER OF A NORMAL FREQUENCY DISTRIBUTION IN ANY NUMBER OF VARIABLES , 1918 .

[5]  Harry Vennema,et al.  Increase in viral gastroenteritis outbreaks in Europe and epidemic spread of new norovirus variant , 2004, The Lancet.

[6]  Anthony S. Tay,et al.  Multivariate Density Forecast Evaluation and Calibration In Financial Risk Management: High-Frequency Returns on Foreign Exchange , 1999, Review of Economics and Statistics.

[7]  R. Bellman Dynamic programming. , 1957, Science.

[8]  Subanar,et al.  A Second Correlation Method for Multivariate Exchange Rates Forecasting , 2014 .

[9]  Leonhard Held,et al.  Projecting the future burden of cancer: Bayesian age–period–cohort analysis with integrated nested Laplace approximations , 2017, Biometrical journal. Biometrische Zeitschrift.

[10]  M. Tanner,et al.  Outbreaks of gastroenteritis due to infections with Norovirus in Switzerland, 2001–2003 , 2005, Epidemiology and Infection.

[11]  T. Thorarinsdottir,et al.  Assessing the Calibration of High-Dimensional Ensemble Forecasts Using Rank Histograms , 2013, 1310.0236.

[12]  Thordis L. Thorarinsdottir,et al.  Multivariate probabilistic forecasting using ensemble Bayesian model averaging and copulas , 2012, 1202.3956.

[13]  Morton B. Brown 400: A Method for Combining Non-Independent, One-Sided Tests of Significance , 1975 .

[14]  Michael P. Clements Evaluating Econometric Forecasts of Economic and Financial Variables , 2005 .

[15]  D J Spiegelhalter,et al.  Probabilistic prediction in patient management and clinical trials. , 1986, Statistics in medicine.

[16]  I. Jolliffe,et al.  Forecast verification : a practitioner's guide in atmospheric science , 2011 .

[17]  Anthony S. Tay,et al.  Evaluating Density Forecasts with Applications to Financial Risk Management , 1998 .

[18]  L Held,et al.  Predictive assessment of a non‐linear random effects model for multivariate time series of infectious disease counts , 2011, Statistics in medicine.

[19]  L. Held,et al.  Multivariate modelling of infectious disease surveillance data , 2008, Statistics in medicine.

[20]  K. Gibson,et al.  A Norovirus Outbreak at a Long-Term-Care Facility: The Role of Environmental Surface Contamination , 2005, Infection Control & Hospital Epidemiology.

[21]  Yong Bao,et al.  Comparing Density Forecast Models , 2007 .

[22]  Rory A. Fisher,et al.  Statistical Methods for Research Workers. , 1956 .

[23]  D. Cox Two further applications of a model for binary regression , 1958 .

[24]  P. Diggle,et al.  Analysis of Longitudinal Data , 2003 .

[25]  G. Brier VERIFICATION OF FORECASTS EXPRESSED IN TERMS OF PROBABILITY , 1950 .

[26]  J. Bartko,et al.  Approximating the Negative Binomial , 1966 .

[27]  Nicholas E. Graham,et al.  Conditional Exceedance Probabilities , 2007 .

[28]  Leonhard Held,et al.  Spatio-Temporal Analysis of Epidemic Phenomena Using the R Package surveillance , 2014, ArXiv.

[29]  Y. Benjamini,et al.  Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .

[30]  L. Held,et al.  Assessing probabilistic forecasts of multivariate quantities, with an application to ensemble predictions of surface winds , 2008 .

[31]  T. Hamill,et al.  Variogram-Based Proper Scoring Rules for Probabilistic Forecasts of Multivariate Quantities* , 2015 .

[32]  A. P. Dawid,et al.  Present position and potential developments: some personal views , 1984 .

[33]  Leonhard Held,et al.  Calibration tests for count data , 2014 .

[34]  L Held,et al.  A Score Regression Approach to Assess Calibration of Continuous Probabilistic Predictions , 2010, Biometrics.

[35]  A. Raftery,et al.  Probabilistic forecasts, calibration and sharpness , 2007 .

[36]  Ranjan K. Mallik,et al.  Some Properties of the Uniform Correlation Matrix and Their Applications , 2007, 2007 IEEE Wireless Communications and Networking Conference.