Assessment of inter-rater agreement on the grading of intravascular bubble signals.

Transcutaneous Doppler ultrasonic bubble detectors are widely used in decompression research. However, interpretation of the complex acoustic signals from the bubble detectors involves a degree of subjectivity, and the comparability of grades assigned by different raters must be assessed. Hypothetical data were used to determine an appropriate method for evaluating the comparability of Doppler raters and to illustrate the limitations of many nonparametric statistics. Two sets of real data were then used to evaluate this procedure, the first from a training exercise carried out by Kisman and Masurel (1978, unpublished) and the second from a test tape that was independently scored by five Defence and Civil Institute of Environmental Medicine Doppler technicians. The results were analyzed by a two-stage approach. First, they were entered into contingency tables and checked for large disagreements, a tendency for one rater to grade higher than the other, and the degree of variability. Second, the results were analyzed with the nonparametric weighted kappa statistic. These studies have led to a practical, efficient method for the evaluation of Doppler raters.
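The two-stage approach described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that grades are integers on an ordinal scale (0 to 4 here), that a "large disagreement" means a gap of two or more grades, and that disagreement weights are linear in the grade difference; the abstract does not specify any of these. The function names and the sample grades are hypothetical.

```python
def contingency_table(rater_a, rater_b, n_grades):
    """Stage 1: cross-tabulate paired grades (rows: rater A, columns: rater B)."""
    table = [[0] * n_grades for _ in range(n_grades)]
    for a, b in zip(rater_a, rater_b):
        table[a][b] += 1
    return table


def screen(table, large_gap=2):
    """Stage 1 checks: large disagreements and a tendency to grade higher.

    large_gap is an assumed threshold, not a value taken from the paper.
    """
    k = len(table)
    cells = [(i, j, table[i][j]) for i in range(k) for j in range(k)]
    return {
        "exact": sum(c for i, j, c in cells if i == j),
        "large": sum(c for i, j, c in cells if abs(i - j) >= large_gap),
        "a_higher": sum(c for i, j, c in cells if i > j),
        "b_higher": sum(c for i, j, c in cells if i < j),
    }


def weighted_kappa(table, power=1):
    """Stage 2: weighted kappa from a square contingency table.

    Disagreement weights are (|i - j| / (k - 1)) ** power:
    power=1 gives linear weights, power=2 quadratic weights (both assumed options).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power
            observed += w * table[i][j] / n
            expected += w * row_tot[i] * col_tot[j] / (n * n)
    return 1.0 - observed / expected


# Hypothetical grades from two raters scoring the same six recordings.
grades_a = [0, 1, 2, 3, 4, 2]
grades_b = [0, 1, 3, 3, 2, 2]

table = contingency_table(grades_a, grades_b, n_grades=5)
print(screen(table))
print(round(weighted_kappa(table), 3))
```

Screening the table first keeps systematic bias and isolated large disagreements visible, since a single summary statistic such as weighted kappa can mask both.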