Application of tetrachoric and polychoric correlation coefficients to forecast verification

The measure of association in 2 × 2 (K × K) contingency tables known as the tetrachoric (polychoric) correlation coefficient is reviewed. These measures rely on two assumptions: 1) continuous latent variables underlie the contingency table, and 2) the joint distribution of the corresponding standard normal deviates is bivariate normal. It is shown that, in practice, the tetrachoric (polychoric) correlation coefficient is an estimate of the Pearson correlation coefficient between the latent variables. Consequently, these measures depend neither on bias nor on the marginal frequencies of the table. This implies a natural and convenient partition of the information carried by the contingency table into association, bias and probability of the event, and in turn enables an analysis of how other scores depend on bias and marginal frequencies. Extending the results to K × K tables leads to a possible reduction in dimensionality from K² to 2K. The theoretical findings are illustrated through an analysis of real-life 6 × 6 contingency tables from the verification of quantitative precipitation forecasts.
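As an illustration of the two assumptions above, the following minimal sketch estimates a tetrachoric correlation from a 2 × 2 table. It is not the paper's algorithm: the function names `bvn_cdf` and `tetrachoric`, the table layout, the Simpson integration and the bisection search are all assumptions chosen for brevity. The thresholds on the latent variables are taken from the marginal frequencies, and the correlation is the value at which the bivariate normal probability of the first cell matches its observed relative frequency. The sketch uses the standard identity that the derivative of the bivariate normal distribution function with respect to the correlation equals the bivariate normal density.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for thresholds and marginals


def bvn_cdf(h, k, rho, n=400):
    """P(X <= h, Y <= k) for a standard bivariate normal with correlation rho.

    Integrates the bivariate density over the correlation from 0 to rho
    (composite Simpson rule, n even) and adds the independence term.
    """
    def density(r):
        q = (h * h - 2.0 * r * h * k + k * k) / (2.0 * (1.0 - r * r))
        return exp(-q) / (2.0 * pi * sqrt(1.0 - r * r))

    step = rho / n
    total = density(0.0) + density(rho)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * density(i * step)
    return _STD.cdf(h) * _STD.cdf(k) + total * step / 3.0


def tetrachoric(table, tol=1e-8):
    """Tetrachoric correlation of a 2 x 2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): cell a counts cases where both latent
    variables fall below their thresholds. Both margins must be non-empty,
    otherwise the inverse normal CDF is undefined.
    """
    (a, b), (c, d) = table
    total = a + b + c + d
    h = _STD.inv_cdf((a + b) / total)  # threshold of the row variable
    k = _STD.inv_cdf((a + c) / total)  # threshold of the column variable
    target = a / total                 # observed frequency of cell a

    # bvn_cdf increases with rho, so bisection finds the matching value.
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bvn_cdf(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `tetrachoric([[40, 10], [10, 40]])` gives roughly 0.809, which agrees with the closed form sin(0.3π) that holds when both thresholds are zero, and a table with equal counts in all four cells gives a value close to zero.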
