Cohen’s linearly weighted kappa is a weighted average

An n × n agreement table $F = \{f_{ij}\}$ with n ≥ 3 ordered categories can, for fixed m with 2 ≤ m ≤ n − 1, be collapsed into $${\binom{n-1}{m-1}}$$ distinct m × m tables by combining adjacent categories. It is shown that the components (observed and expected agreement) of Cohen’s weighted kappa with linear weights can be obtained from these m × m subtables. Consequently, weighted kappa with linear weights can be interpreted as a weighted average of the linearly weighted kappas of the m × m tables, where the weights are the denominators of those kappas. Moreover, weighted kappa with linear weights can be interpreted as a weighted average of the linearly weighted kappas of all nontrivial subtables.
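The m = 2 case of this identity can be checked numerically: collapsing the n ordered categories at each of the n − 1 cut points yields n − 1 distinct 2 × 2 tables, and linearly weighted kappa equals the average of the resulting 2 × 2 kappas weighted by their denominators (one minus the expected agreement). The sketch below, using a hypothetical 3 × 3 agreement table, illustrates this; the table values and function names are illustrative assumptions, not taken from the paper.

```python
def linear_weighted_kappa(F):
    """Cohen's weighted kappa with linear weights w_ij = 1 - |i-j|/(n-1)."""
    n = len(F)
    total = sum(sum(row) for row in F)
    P = [[F[i][j] / total for j in range(n)] for i in range(n)]
    row = [sum(P[i]) for i in range(n)]
    col = [sum(P[i][j] for i in range(n)) for j in range(n)]
    w = lambda i, j: 1 - abs(i - j) / (n - 1)
    po = sum(w(i, j) * P[i][j] for i in range(n) for j in range(n))
    pe = sum(w(i, j) * row[i] * col[j] for i in range(n) for j in range(n))
    return (po - pe) / (1 - pe)

def collapsed_kappa_components(F, c):
    """Collapse categories {0..c} vs {c+1..n-1} into a 2x2 table.

    Returns the numerator (po - pe) and denominator (1 - pe) of the
    resulting unweighted Cohen's kappa.
    """
    n = len(F)
    total = sum(sum(row) for row in F)
    # Observed agreement of the collapsed 2x2 table.
    a = sum(F[i][j] for i in range(c + 1) for j in range(c + 1)) / total
    d = sum(F[i][j] for i in range(c + 1, n) for j in range(c + 1, n)) / total
    po = a + d
    # Marginal proportions of the first collapsed category for each rater.
    r1 = sum(F[i][j] for i in range(c + 1) for j in range(n)) / total
    c1 = sum(F[i][j] for i in range(n) for j in range(c + 1)) / total
    pe = r1 * c1 + (1 - r1) * (1 - c1)
    return po - pe, 1 - pe

# Hypothetical 3x3 agreement table (counts for two raters, 3 ordered categories).
F = [[20, 5, 1],
     [4, 15, 6],
     [2, 3, 14]]

# Weighted average of the n-1 collapsed 2x2 kappas, weighted by denominators:
# sum of numerators over sum of denominators.
comps = [collapsed_kappa_components(F, c) for c in range(len(F) - 1)]
weighted_avg = sum(num for num, den in comps) / sum(den for num, den in comps)

assert abs(linear_weighted_kappa(F) - weighted_avg) < 1e-9
```

The identity holds because the linear disagreement weight |i − j|/(n − 1) counts, up to the factor 1/(n − 1), the number of cut points separating categories i and j, so both the observed and the expected disagreement of the full table decompose as sums over the 2 × 2 subtables.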
