Measurement of interobserver agreement using a "standard": measure formulation and statistical inferences

This paper addresses the measurement of agreement between two observers who independently classify items or observations into a given set of categories. The proposed agreement measure applies when one observer is regarded as the "standard" against which the other observer's ratings are compared. The measure allows different types of disagreement to be weighted differently, and statistical inference procedures are outlined.
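The abstract does not state the formula of the proposed measure, so the following is an illustration only. It sketches Cohen's weighted kappa, a familiar chance-corrected agreement statistic, and should not be read as the measure proposed in this paper. The function name `weighted_kappa`, its parameters, and the convention that weights are indexed as `weights[standard category][observer category]` are assumptions made for this example. Indexing by the standard first permits asymmetric weights, which is one way a standard-based setting might treat the two directions of disagreement differently.

```python
def weighted_kappa(standard, observer, categories, weights=None):
    """Cohen's weighted kappa for two raters (illustrative stand-in,
    not the measure proposed in the paper).

    standard, observer: equal-length sequences of category labels.
    categories: list of all possible labels.
    weights: k x k agreement weights in [0, 1], indexed
             weights[standard_category][observer_category] (assumed
             convention); defaults to 1 on the diagonal and 0 elsewhere,
             which gives unweighted kappa.
    """
    n = len(standard)
    if n == 0 or n != len(observer):
        raise ValueError("ratings must be non-empty and of equal length")

    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    if weights is None:
        weights = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]

    # Cross-classification table of counts: rows = standard, columns = observer.
    counts = [[0] * k for _ in range(k)]
    for s, o in zip(standard, observer):
        counts[index[s]][index[o]] += 1

    # Joint and marginal proportions.
    p = [[counts[i][j] / n for j in range(k)] for i in range(k)]
    row = [sum(p[i]) for i in range(k)]
    col = [sum(p[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed agreement and weighted agreement expected by chance.
    p_obs = sum(weights[i][j] * p[i][j] for i in range(k) for j in range(k))
    p_exp = sum(weights[i][j] * row[i] * col[j] for i in range(k) for j in range(k))

    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical ratings: the observer disagrees with the standard on one item.
standard_ratings = ["a", "a", "b", "b"]
observer_ratings = ["a", "a", "b", "a"]
print(weighted_kappa(standard_ratings, observer_ratings, ["a", "b"]))  # 0.5
```

Supplying a weight matrix with partial credit off the diagonal lets some disagreements count as less serious than others, in the spirit of the differential weighting described above.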