Verification of operational solar flare forecast: Case of Regional Warning Center Japan

In this article, we discuss a verification study of the operational solar flare forecast at the Regional Warning Center (RWC) Japan. RWC Japan has long issued four-category deterministic solar flare forecasts. For this verification study, we used solar flare forecast data accumulated over 16 years (from 2000 to 2015). We compiled the forecast data together with solar flare data obtained with the Geostationary Operational Environmental Satellites (GOES). Using the compiled data sets, we estimated several conventional scalar verification measures with 95% confidence intervals. We also estimated a multi-categorical scalar verification measure. These scalar verification measures were compared with those obtained by the persistence method and the recurrence method. Because solar activity varied during the 16 years, we also applied the verification analyses to four subsets of forecast-observation pairs with different solar activity levels. We cannot conclude definitively that there are significant performance differences between the forecasts of RWC Japan and the persistence method, although a marginally significant difference is found for some event definitions. We propose using a scalar verification measure to assess the judgment skill of the operational solar flare forecast. Finally, we propose a verification strategy for deterministic operational solar flare forecasting. For dichotomous forecasts, the proposed set of verification measures is frequency bias for bias, proportion correct and critical success index for accuracy, probability of detection for discrimination, false alarm ratio for reliability, Peirce skill score for forecast skill, and symmetric extremal dependence index for association.
For multi-categorical forecasts, the proposed set of verification measures is the marginal distributions of forecast and observation for bias, proportion correct for accuracy, the correlation coefficient and joint probability distribution for association, the likelihood distribution for discrimination, the calibration distribution for reliability and resolution, and the Gandin-Murphy-Gerrity score and judgment skill score for skill.
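The dichotomous measures listed above can all be derived from a 2x2 contingency table of forecast-observation pairs. The following is a minimal sketch using the standard textbook definitions of these measures, not the authors' own code; the function name `dichotomous_measures`, the dictionary keys, and the example counts are hypothetical and are not taken from the RWC Japan data.

```python
import math


def dichotomous_measures(hits, false_alarms, misses, correct_negatives):
    """Standard 2x2 contingency-table measures (textbook definitions).

    Assumes every denominator is non-zero and that the probability of
    detection and the probability of false detection lie strictly
    between 0 and 1; otherwise SEDI is undefined.
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d

    pod = a / (a + c)    # probability of detection (hit rate)
    pofd = b / (b + d)   # probability of false detection

    # Symmetric extremal dependence index, built from logs of the hit
    # rate and the false detection rate.
    sedi_num = (math.log(pofd) - math.log(pod)
                - math.log(1 - pofd) + math.log(1 - pod))
    sedi_den = (math.log(pofd) + math.log(pod)
                + math.log(1 - pofd) + math.log(1 - pod))

    return {
        "frequency_bias": (a + b) / (a + c),          # bias
        "proportion_correct": (a + d) / n,            # accuracy
        "critical_success_index": a / (a + b + c),    # accuracy
        "probability_of_detection": pod,              # discrimination
        "false_alarm_ratio": b / (a + b),             # reliability
        "peirce_skill_score": pod - pofd,             # skill
        "sedi": sedi_num / sedi_den,                  # association
    }


# Hypothetical counts, for illustration only.
scores = dichotomous_measures(hits=30, false_alarms=20,
                              misses=10, correct_negatives=140)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In practice each measure would be reported with a confidence interval, as the study does; this sketch returns point estimates only.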
