A Comparison of Methods for the Evaluation of Binary Measurement Systems

Many quality programs prescribe that a measurement system analysis (MSA) be performed on key quality characteristics. This safeguards the reliability of the acquired data, which serve as the basis for drawing conclusions about the behavior of those characteristics. For continuous characteristics, the Gauge R&R study is the generally accepted statistical technique for MSA. For binary characteristics, no such universally accepted equivalent is available. We discuss methods that could serve as an MSA for binary data and argue that a latent class model is the most promising candidate.
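
To make the latent class approach concrete, below is a minimal sketch of how such a model might be fitted to binary pass/fail ratings. It assumes a two-class model (conforming vs. nonconforming parts) in which appraiser ratings are conditionally independent given the latent class, estimated with the EM algorithm; the function name `fit_latent_class`, the simulated data, and all parameter choices are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fit_latent_class(x, n_iter=200, tol=1e-8, seed=0):
    """Fit a two-class latent class model to binary ratings by EM (sketch).

    x : (N, J) array of 0/1 ratings (N parts, J appraisers).
    Returns pi (2,), the latent class prevalences, and p (2, J), where
    p[k, j] = P(appraiser j rates 'pass' | latent class k).
    Note: latent class labels are arbitrary and may come out swapped.
    """
    rng = np.random.default_rng(seed)
    n_parts, n_raters = x.shape
    pi = np.array([0.5, 0.5])
    p = rng.uniform(0.3, 0.7, size=(2, n_raters))  # random start breaks symmetry
    ll_old = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each latent class per part,
        # under conditional independence of appraisers given the class.
        log_cond = (x[:, None, :] * np.log(p)
                    + (1 - x[:, None, :]) * np.log(1 - p)).sum(axis=2)
        log_joint = log_cond + np.log(pi)
        log_marg = np.logaddexp(log_joint[:, 0], log_joint[:, 1])
        resp = np.exp(log_joint - log_marg[:, None])
        # M-step: re-estimate prevalences and response probabilities.
        pi = resp.mean(axis=0)
        p = np.clip((resp.T @ x) / resp.sum(axis=0)[:, None], 1e-6, 1 - 1e-6)
        ll = log_marg.sum()
        if ll - ll_old < tol:  # stop once the log-likelihood has converged
            break
        ll_old = ll
    return pi, p

# Illustrative use: 50 parts rated by 3 appraisers, simulated with 30%
# nonconforming parts and appraisers that rate correctly 90% of the time.
rng = np.random.default_rng(1)
truth = rng.random(50) < 0.3
ratings = (rng.random((50, 3)) < np.where(truth[:, None], 0.9, 0.1)).astype(int)
pi_hat, p_hat = fit_latent_class(ratings)
```

The estimated p_hat plays the role that repeatability and reproducibility estimates play in a Gauge R&R study: it quantifies, per appraiser, the probability of a correct classification given the (unobserved) true state of the part.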
