A Generalization of Cohen's Kappa Agreement Measure to Interval Measurement and Multiple Raters

Cohen's kappa statistic is frequently used to measure agreement between two observers employing categorical polytomies. In this paper, Cohen's statistic is shown to be inherently multivariate in nature; it is expanded to analyze ordinal and interval data; and it is extended to more than two observers. A nonasymptotic test of significance is provided for the generalized statistic.
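For orientation, the classical two-rater, nominal-scale statistic that the paper generalizes is kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from the raters' marginal category proportions. The sketch below is a minimal Python illustration of this baseline two-rater kappa only; the function name and toy ratings are assumptions for illustration and do not reproduce the multivariate, multi-rater extension or the nonasymptotic test developed in the paper.

    # Classical two-rater, nominal-scale kappa: (p_o - p_e) / (1 - p_e).
    from collections import Counter

    def cohens_kappa(ratings_a, ratings_b):
        n = len(ratings_a)
        assert n == len(ratings_b) and n > 0
        # Observed proportion of exact agreement.
        p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
        # Chance agreement from the product of the two raters' marginals.
        marg_a = Counter(ratings_a)
        marg_b = Counter(ratings_b)
        p_e = sum(marg_a[c] * marg_b.get(c, 0) for c in marg_a) / (n * n)
        return (p_o - p_e) / (1 - p_e)

    if __name__ == "__main__":
        # Hypothetical example data for two raters on six subjects.
        rater_a = ["yes", "yes", "no", "no", "yes", "no"]
        rater_b = ["yes", "no", "no", "no", "yes", "yes"]
        print(f"kappa = {cohens_kappa(rater_a, rater_b):.3f}")  # 0.333
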
