Using Macros to Simplify the Calculation of Multi-Rater Observation Agreement

This paper describes using several macro programs to calculate multi-rater observation agreement with the Kappa statistic in SAS®. It presents an example in which four raters watched a video and selected the tasks they observed. Each rater could select up to ten tasks, and raters could select different numbers of tasks. Inter-rater reliability (IRR) among the four raters is examined with the Kappa statistic, calculated using the SAS® FREQ, MEANS, and PRINT procedures. The Kappa statistic and its 95% confidence interval (CI) were calculated for each pair of raters, and the overall IRR was obtained by averaging the pairwise Kappa values. The paper thus provides an example of using macros to calculate percentage agreement and the Kappa statistic with a 95% CI, using PROC FREQ, PROC MEANS, and PROC PRINT, for multiple raters with multiple observation categories. The program can be extended to more raters and more tasks, expanding the current functionality of the SAS® FREQ procedure to support application of the Kappa statistic to more than two raters and several categories.
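
The pairwise approach can be outlined in a few lines of SAS code. The sketch below is not the paper's actual macro; it assumes a hypothetical dataset named TASKS with one row per task and binary indicator variables RATER1-RATER4 (1 = task selected, 0 = not selected). It uses the AGREE option of PROC FREQ together with an OUTPUT statement to capture the simple Kappa and its 95% confidence limits, then PROC MEANS to average the pairwise Kappas and PROC PRINT to display them.

   /* Minimal sketch (not the paper's macro): pairwise Kappa for four raters,  */
   /* averaged to give an overall IRR. Assumes a hypothetical dataset TASKS    */
   /* with one row per task and binary indicators RATER1-RATER4.               */

   %macro pairkappa(r1=, r2=);
      /* Simple Kappa with 95% CI for one pair of raters.                      */
      /* The KAPPA keyword on the OUTPUT statement writes _KAPPA_, E_KAPPA,    */
      /* L_KAPPA, and U_KAPPA to the output dataset.                           */
      proc freq data=tasks noprint;
         tables &r1*&r2 / agree;
         output out=_k kappa;
      run;

      /* Label the pair and append it to the running list of pairwise Kappas */
      data _k;
         set _k;
         length pair $32;
         pair = "&r1 vs &r2";
      run;

      proc append base=all_kappas data=_k force;
      run;
   %mend pairkappa;

   /* All six pairwise comparisons among the four raters */
   %pairkappa(r1=rater1, r2=rater2);
   %pairkappa(r1=rater1, r2=rater3);
   %pairkappa(r1=rater1, r2=rater4);
   %pairkappa(r1=rater2, r2=rater3);
   %pairkappa(r1=rater2, r2=rater4);
   %pairkappa(r1=rater3, r2=rater4);

   /* Pairwise Kappas with 95% confidence limits */
   proc print data=all_kappas;
      var pair _kappa_ l_kappa u_kappa;
   run;

   /* Overall IRR: the average of the pairwise Kappa coefficients */
   proc means data=all_kappas mean;
      var _kappa_;
   run;

Generating one pairwise table per call keeps each Kappa within PROC FREQ's standard two-rater framework; the macro simply automates the six calls and collects the results so the averaging step is a single PROC MEANS.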