A New Interpretation of the Weighted Kappa Coefficients

Reliability and agreement studies are of paramount importance. They contribute to the quality of studies by providing information about the amount of error inherent in any diagnosis, score, or measurement. Guidelines for reporting reliability and agreement studies were recently provided. While the kappa-like family of coefficients is advised for categorical and ordinal scales, no further guidance on the choice of a weighting scheme is given. In the present paper, a new, simple, and practical interpretation of the linear- and quadratic-weighted kappa coefficients is given. This will help researchers justify their choice of a weighting scheme.
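For readers unfamiliar with the two weighting schemes, the following is a minimal sketch of how Cohen's weighted kappa is commonly computed from a square agreement table of two raters. It assumes the standard agreement weights, 1 - |i - j|/(k - 1) for the linear scheme and 1 - (i - j)^2/(k - 1)^2 for the quadratic scheme, on a scale with k ordered categories. The function name `weighted_kappa` and the example table are illustrative only and are not taken from the paper.

```python
import numpy as np


def weighted_kappa(table, scheme="linear"):
    """Cohen's weighted kappa for a k x k table of counts (rater A x rater B).

    Illustrative sketch: `scheme` selects the standard linear or quadratic
    agreement weights; it is not the paper's own code.
    """
    p = np.asarray(table, dtype=float)
    if p.ndim != 2 or p.shape[0] != p.shape[1]:
        raise ValueError("table must be square")
    p = p / p.sum()                      # joint proportions
    k = p.shape[0]

    # Normalised distance between categories i and j, in [0, 1].
    i, j = np.indices((k, k))
    d = np.abs(i - j) / (k - 1)

    if scheme == "linear":
        w = 1.0 - d                      # 1 - |i - j| / (k - 1)
    elif scheme == "quadratic":
        w = 1.0 - d ** 2                 # 1 - (i - j)^2 / (k - 1)^2
    else:
        raise ValueError("scheme must be 'linear' or 'quadratic'")

    row = p.sum(axis=1)                  # marginal distribution of rater A
    col = p.sum(axis=0)                  # marginal distribution of rater B

    p_obs = (w * p).sum()                    # weighted observed agreement
    p_exp = (w * np.outer(row, col)).sum()   # weighted agreement under independence
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical 3-category table, for illustration only.
example = [[20, 5, 1],
           [4, 15, 6],
           [1, 5, 18]]
kappa_linear = weighted_kappa(example, "linear")
kappa_quadratic = weighted_kappa(example, "quadratic")
```

The two schemes differ only in how strongly they penalise disagreements that are far apart on the scale: quadratic weights give more partial credit than linear weights to disagreements between nearby categories, which is why the two coefficients can differ on the same table.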
