Nominal scale agreement among observers

Experiments are considered in which each subject in a sample is assigned to one of C categories separately by each member of a fixed or varying group of observers. Building on earlier publications, general procedures are proposed for analyzing agreement and disagreement among observers. For a varying group of observers, it is shown that the number of observers per subject need not be constant. For a fixed group of observers, the problem of missing data is considered. The procedures are illustrated with two clinical diagnosis examples. The first example investigates which categories are relatively hard to distinguish from one another; a new theorem is applied that shows a useful property of the statistic kappa. The second example investigates whether a subgroup of observers can be found with a significantly higher degree of interobserver agreement.
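
As a rough illustration of the varying-group setting, the sketch below computes a chance-corrected agreement coefficient of the kappa type when the number of observers differs from subject to subject. It is a generic pairwise-agreement construction in the style of Fleiss's kappa, not necessarily the estimator proposed in this paper. The function name `multi_observer_kappa`, the input layout (one row of category counts per subject), and the equal weighting of subjects are all assumptions made for the example.

```python
from typing import Sequence


def multi_observer_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Kappa-type agreement for subjects rated by varying numbers of observers.

    counts[i][c] is the number of observers who assigned subject i to
    category c. Rows may sum to different totals (assumed layout).
    """
    n_categories = len(counts[0])
    category_totals = [0] * n_categories
    total_ratings = 0
    agreement_sum = 0.0

    for row in counts:
        n_i = sum(row)
        if n_i < 2:
            raise ValueError("each subject needs at least two observers")
        # Proportion of agreeing observer pairs for this subject.
        agreeing_pairs = sum(k * (k - 1) for k in row)
        agreement_sum += agreeing_pairs / (n_i * (n_i - 1))
        for c, k in enumerate(row):
            category_totals[c] += k
        total_ratings += n_i

    # Observed agreement: subjects weighted equally (an assumption).
    p_observed = agreement_sum / len(counts)
    # Chance agreement from the pooled category proportions.
    p_expected = sum((t / total_ratings) ** 2 for t in category_totals)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Four subjects, two categories, two or three observers per subject.
    example = [[3, 0], [0, 3], [2, 0], [0, 2]]
    print(multi_observer_kappa(example))
```

Other weightings of subjects, for instance by the number of observer pairs, are equally defensible and give different values when the number of observers per subject varies widely.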