Inequalities Between Kappa and Kappa-Like Statistics for k×k Tables

The paper presents inequalities between four descriptive statistics that can be expressed in the form [P−E(P)]/[1−E(P)], where P is the observed proportion of agreement in a k×k table with identical categories, and E(P) is a function of the marginal probabilities. Scott’s π is an upper bound of Goodman and Kruskal’s λ and a lower bound of both Bennett et al.’s S and Cohen’s κ. We introduce a concept for the marginal probabilities of the k×k table called weak marginal symmetry. Using the rearrangement inequality, we show that Bennett et al.’s S is an upper bound of Cohen’s κ if the k×k table is weakly marginal symmetric.
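The shared form [P−E(P)]/[1−E(P)] can be illustrated with a short sketch. The expressions for E(P) below follow the standard definitions of Bennett et al.'s S (E = 1/k), Scott's π (sum of squared mean marginals), and Cohen's κ (sum of products of marginals); the 3×3 table is a hypothetical example, not data from the paper, and the code is illustrative rather than the authors' implementation.

```python
def agreement_stats(table):
    """Compute Bennett et al.'s S, Scott's pi, and Cohen's kappa
    for a square k x k agreement table of counts."""
    n = sum(sum(row) for row in table)
    k = len(table)
    p = [[c / n for c in row] for row in table]          # cell proportions
    P = sum(p[i][i] for i in range(k))                    # observed agreement
    row = [sum(p[i]) for i in range(k)]                   # row marginals p_i+
    col = [sum(p[i][j] for i in range(k)) for j in range(k)]  # column marginals p_+i

    def chance_corrected(E):
        return (P - E) / (1 - E)

    S = chance_corrected(1 / k)                                            # Bennett et al.'s S
    pi = chance_corrected(sum(((row[i] + col[i]) / 2) ** 2 for i in range(k)))  # Scott's pi
    kappa = chance_corrected(sum(row[i] * col[i] for i in range(k)))       # Cohen's kappa
    return S, pi, kappa

# Hypothetical 3x3 table of counts for two raters.
table = [[20, 5, 3],
         [4, 15, 6],
         [2, 3, 12]]
S, pi, kappa = agreement_stats(table)
# Since ((a+b)/2)^2 >= ab for all a, b (AM-GM), E(P) for pi is at least
# E(P) for kappa, which yields the inequality pi <= kappa.
```

For this table the ordering π ≤ κ ≤ S holds, consistent with the bounds stated above; the inequality κ ≤ S requires the weak marginal symmetry condition introduced in the paper, whereas π ≤ κ holds for any marginals.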

[1] A. Agresti, et al., Categorical Data Analysis, 1991, International Encyclopedia of Statistical Science.

[2] A. Feinstein, et al., High agreement but low kappa: I. The problems of two paradoxes, 1990, Journal of Clinical Epidemiology.

[3] J. Guilford, et al., A Note on the G Index of Agreement, 1964.

[4] A. E. Maxwell, Comparing the Classification of Subjects by Two Independent Judges, 1970, British Journal of Psychiatry.

[5] J. Carlin, et al., Bias, prevalence and kappa, 1993, Journal of Clinical Epidemiology.

[6] J. Vegelius, et al., On Generalizations of the G Index and the Phi Coefficient to Nominal Scales, 1979, Multivariate Behavioral Research.

[7] M. Warrens, On Association Coefficients for 2×2 Tables and Properties That Do Not Depend on the Marginal Distributions, 2008, Psychometrika.

[8] James C. Reed, Book Reviews: Visual Perceptual Abilities and Early Reading Progress by Jean Turner Goins, Supplementary Educational Monographs, #87, Chicago: University of Chicago Press, 1958, pp. x + 108, 1960.

[9] Hans Visser, et al., The Map Comparison Kit, 2006, Environmental Modelling & Software.

[10] Matthijs J. Warrens, On the Indeterminacy of Resemblance Measures for Binary (Presence/Absence) Data, 2008, Journal of Classification.

[11] Jacob Cohen, A Coefficient of Agreement for Nominal Scales, 1960.

[12] Rebecca Zwick, Another look at interrater agreement, 1988, Psychological Bulletin.

[13] Klaus Krippendorff, Association, agreement, and equity, 1987.

[14] Dale J. Prediger, Coefficient Kappa: Some Uses, Misuses, and Alternatives, 1981.

[15] J. D. Mast, Agreement and Kappa-Type Indices, 2007.

[16] Matthijs J. Warrens, On the Equivalence of Cohen’s Kappa and the Hubert-Arabie Adjusted Rand Index, 2008, Journal of Classification.

[17] W. A. Scott, Reliability of Content Analysis: The Case of Nominal Scale Coding, 1955.

[18] R. Alpert, et al., Communications Through Limited-Response Questioning, 1954.

[19] J. Fleiss, Measuring nominal scale agreement among many raters, 1971.

[20] A. J. Conger, Integration and generalization of kappas for multiple raters, 1980.

[21] Jean-Marc Constans, et al., Fuzzy kappa for the agreement measure of fuzzy classifications, 2007, Neurocomputing.

[22] Alan Agresti, et al., Evaluating Agreement and Disagreement among Movie Reviewers, 1997.

[23] A. E. Maxwell, et al., Coefficients of Agreement Between Observers and Their Interpretation, 1977, British Journal of Psychiatry.

[24] Matthijs J. Warrens, Bounds of Resemblance Measures for Binary (Presence/Absence) Variables, 2008, Journal of Classification.

[25] D. Steinley, Properties of the Hubert-Arabie adjusted Rand index, 2004, Psychological Methods.

[26] J. Fleiss, Measuring agreement between two judges on the presence or absence of a trait, 1975, Biometrics.

[27] M. Warrens, On Similarity Coefficients for 2×2 Tables and Correction for Chance, 2008, Psychometrika.

[28] K. Krippendorff, Reliability in Content Analysis: Some Common Misconceptions and Recommendations, 2004.

[29] L. A. Goodman, et al., Measures of Association for Cross Classifications, 1979.

[30] John J. Koval, et al., Estimating Rater Agreement in 2×2 Tables: Correction for Chance and Intraclass Correlation, 1993.