Confidence intervals for the interrater agreement measure kappa

The asymptotic normal approximation to the distribution of the estimated measure $\hat{\kappa}$ for evaluating agreement between two raters has been shown to perform poorly for small sample sizes when the true kappa is nonzero. This paper examines the effect of skewness corrections and transformations of $\hat{\kappa}$ on the attained confidence levels. Small-sample simulations demonstrate the improved agreement between the nominal and actual levels of confidence intervals and hypothesis tests that incorporate these corrections.
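For context, the sketch below illustrates the baseline procedure the abstract refers to: Cohen's $\hat{\kappa}$ for a two-rater contingency table together with the standard large-sample Wald interval, followed by a small Monte Carlo check of its coverage at a modest sample size. This is a minimal illustration assuming Cohen's (1960) simple approximate standard error; it reproduces the *uncorrected* interval whose small-sample behavior the paper criticizes, not the skewness-corrected or transformed intervals the paper proposes. All function names, the 2x2 parameterization, and the simulation settings are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy import stats

def kappa_wald_ci(table, alpha=0.05):
    """Cohen's kappa-hat and a simple large-sample Wald interval.

    Uses Cohen's (1960) approximate standard error; this is the
    uncorrected asymptotic-normal interval, not a skewness-corrected one.
    """
    p = np.asarray(table, dtype=float)
    n = p.sum()
    p /= n
    po = np.trace(p)                    # observed agreement
    pe = p.sum(axis=1) @ p.sum(axis=0)  # chance-expected agreement
    kappa = (po - pe) / (1.0 - pe)
    se = np.sqrt(po * (1.0 - po) / (n * (1.0 - pe) ** 2))
    z = stats.norm.ppf(1.0 - alpha / 2.0)
    return kappa, (kappa - z * se, kappa + z * se)

def two_by_two_probs(kappa, pi=0.5):
    """2x2 cell probabilities with equal margins pi and true kappa."""
    agree = kappa * pi * (1.0 - pi)
    off = (1.0 - kappa) * pi * (1.0 - pi)
    return np.array([[pi ** 2 + agree, off],
                     [off, (1.0 - pi) ** 2 + agree]])

# Monte Carlo check of the Wald interval's coverage at n = 25
rng = np.random.default_rng(12345)
true_kappa, n, reps = 0.6, 25, 5000
probs = two_by_two_probs(true_kappa).ravel()
covered = skipped = 0
for _ in range(reps):
    counts = rng.multinomial(n, probs).reshape(2, 2)
    # skip degenerate tables where a rater uses only one category
    if 0 in counts.sum(axis=0) or 0 in counts.sum(axis=1):
        skipped += 1
        continue
    _, (lo, hi) = kappa_wald_ci(counts)
    covered += lo <= true_kappa <= hi
print(f"empirical coverage: {covered / (reps - skipped):.3f} (nominal 0.95)")
```

Under this setup, the empirical coverage printed at the end can be compared against the nominal 95% level; the gap between the two at small n and nonzero true kappa is exactly the discrepancy the paper's skewness corrections and transformations are designed to shrink.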