Measurement of reliability for categorical data in medical research

The problem of measuring the reliability of categorical measurements, particularly diagnostic categorizations, is addressed. The approach is based on classical measurement theory and requires that reliability coefficients be interpretable in terms of loss of precision in estimation or loss of power in statistical tests. A general model is proposed, leading to the definition of reliability indices. Design and estimation approaches are discussed. Issues and approaches found in the research literature that lead to either confusing or misleading results are presented. The signs and symptoms of unreliable diagnoses are identified, and strategies for improving the reliability of such diagnoses are discussed.
