Computing inter-rater reliability and its variance in the presence of high agreement.
[1] M. H. Quenouille. Approximate Tests of Correlation in Time-Series, 1949.
[2] W. A. Scott, et al. Reliability of Content Analysis: The Case of Nominal Scale Coding, 1955.
[3] Jacob Cohen. A Coefficient of Agreement for Nominal Scales, 1960.
[4] J. Guilford, et al. A Note on the G Index of Agreement, 1964.
[5] Jacob Cohen, et al. Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit, 1968.
[6] B. Everitt, et al. Large sample standard errors of kappa and weighted kappa, 1969.
[7] R. Light. Measures of response agreement for qualitative data: Some generalizations and alternatives, 1971.
[8] J. Fleiss. Measuring nominal scale agreement among many raters, 1971.
[9] P. Holland, et al. Discrete Multivariate Analysis, 1976.
[10] J. R. Landis, et al. An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers, 1977, Biometrics.
[11] H. Kraemer. Ramifications of a population model for κ as a coefficient of reliability, 1979.
[12] A. J. Conger. Integration and generalization of kappas for multiple raters, 1980.
[13] A. Feinstein, et al. High agreement but low kappa: II. Resolving the paradoxes, 1990, Journal of Clinical Epidemiology.
[14] M. Banerjee, et al. Beyond kappa: A review of interrater agreement measures, 1999.
[15] A. Winsor. Sampling techniques, 2000, Nursing Times.