Interval estimation for Cohen's kappa as a measure of agreement.

Cohen's kappa statistic is a well-known measure of agreement between two raters with respect to a dichotomous outcome. Several expressions for its asymptotic variance have been derived, and the normal approximation to its distribution has been used to construct confidence intervals. However, information on the accuracy of these normal-approximation confidence intervals is not comprehensive. Under the common correlation model for dichotomous data, we evaluate 95 per cent lower confidence bounds constructed using four asymptotic variance expressions. Exact computation, rather than simulation, is employed. Specific conditions under which the use of asymptotic variance formulae is reasonable are determined.
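As an illustration of what an exact evaluation of a lower confidence bound can look like, the following Python sketch enumerates every possible table for n subjects under the common correlation model and sums the probabilities of those tables whose 95 per cent lower bound falls at or below the true kappa. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the estimator is the intraclass form of kappa, the variance is one large-sample expression commonly given for that estimator under this model (it is not claimed to be one of the four expressions evaluated here), and all function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def kappa_hat(n11, nd, n00):
    """Intraclass kappa estimate from counts of both-yes (n11),
    discordant (nd) and both-no (n00) pairs. Returns (kappa, p);
    kappa is None when every rating is identical (p is 0 or 1)."""
    n = n11 + nd + n00
    p = (2 * n11 + nd) / (2 * n)          # pooled proportion of "yes" ratings
    if p in (0.0, 1.0):
        return None, p
    return 1 - nd / (2 * n * p * (1 - p)), p


def lower_bound(n11, nd, n00, level=0.95):
    """One-sided normal-approximation lower bound for kappa.
    Assumption: plug-in large-sample variance
    (1-k)/n * [(1-k)(1-2k) + k(2-k) / (2p(1-p))]."""
    k, p = kappa_hat(n11, nd, n00)
    if k is None:
        return None
    n = n11 + nd + n00
    var = (1 - k) / n * ((1 - k) * (1 - 2 * k) + k * (2 - k) / (2 * p * (1 - p)))
    z = NormalDist().inv_cdf(level)       # about 1.645 for level = 0.95
    return k - z * sqrt(max(var, 0.0))    # guard against tiny negative round-off


def exact_coverage(n, pi, kappa, level=0.95):
    """Exact coverage of the lower bound by full enumeration.
    Returns (covered, undefined): the probability that the bound is at or
    below the true kappa, and the probability of tables where the
    estimate is undefined."""
    # Cell probabilities under the common correlation model.
    p11 = pi ** 2 + kappa * pi * (1 - pi)
    pd = 2 * pi * (1 - pi) * (1 - kappa)
    p00 = (1 - pi) ** 2 + kappa * pi * (1 - pi)
    covered = undefined = 0.0
    for n11 in range(n + 1):
        for nd in range(n - n11 + 1):
            n00 = n - n11 - nd
            # Trinomial probability of this table.
            prob = (comb(n, n11) * comb(n - n11, nd)
                    * p11 ** n11 * pd ** nd * p00 ** n00)
            lb = lower_bound(n11, nd, n00, level)
            if lb is None:
                undefined += prob
            elif lb <= kappa:
                covered += prob
    return covered, undefined


if __name__ == "__main__":
    cov, undef = exact_coverage(n=20, pi=0.3, kappa=0.4)
    print(f"exact coverage: {cov:.4f}, undefined mass: {undef:.6f}")
```

Comparing the returned coverage with the nominal 0.95 across a grid of n, pi and kappa values is one way to see where a given variance formula is adequate. Because every table is enumerated, the result carries no Monte Carlo error, at a cost that grows roughly with the square of n.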
