Measurement reliability and agreement in psychiatry

Psychiatric research has benefited from attention to measurement theories of reliability, and reliability and agreement statistics for psychopathology ratings and diagnoses are regularly reported in empirical studies. Nevertheless, controversy remains over how reliability should be measured and how much of a research program's resources should be devoted to studying measurement quality. These issues are discussed in the context of recent theoretical and technical contributions to the statistical analysis of reliability. Special attention is paid to statistical studies published since Kraemer's 1992 review of reliability methods in this journal.
