Interrater reliability of computer-assisted scoring of breathing during sleep.

Twenty-six records of sleep and breathing, obtained with a portable monitoring system from elderly subjects, were scored with computer assistance by three raters to examine the interrater reliability of scoring. The raters were a medical student, a nurse practitioner, and a family physician, all of whom had at least one month's experience with the equipment. Agreement among raters was measured with the unweighted kappa statistic. Significant agreement was observed for all variables, although it was stronger for variables describing breathing (kappa range 0.71-0.87) than for those describing sleep (kappa range 0.34-0.57). Complete agreement among the three raters on diagnostic classification occurred in 17 cases. In the remaining 9 cases, 2 raters agreed, while the third differed by no more than one category of disturbance type (e.g., normal versus hypopnea, hypopnea versus apnea) or severity (e.g., mild versus moderate). Among the 9 subjects with severe respiratory disturbance, there was only one such disagreement. We conclude that the interrater reliability of identifying and characterizing breathing disturbance during sleep, as recorded by portable monitoring, is high among trained raters using computer assistance.
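
For readers unfamiliar with the statistic: unweighted kappa measures agreement beyond chance as (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected from each rater's marginal category frequencies. The abstract does not state whether pairwise Cohen's kappa or a multirater generalization (e.g., Fleiss' kappa) was used; the minimal sketch below assumes pairwise Cohen's kappa averaged over the three rater pairs, and all rater names, category labels, and ratings in it are hypothetical.

```python
# Minimal sketch: unweighted (Cohen's) kappa for each pair of raters,
# averaged across pairs. Pairwise averaging is an assumption; the study
# may have used a multirater kappa instead. Data below are hypothetical.
from itertools import combinations
from collections import Counter

def cohen_kappa(a, b):
    """Unweighted kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    ca, cb = Counter(a), Counter(b)
    # chance agreement from each rater's marginal category frequencies
    p_e = sum(ca[k] * cb[k] for k in set(a) | set(b)) / n**2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical diagnostic classifications by three raters on six records
ratings = {
    "student":   ["normal", "hypopnea", "apnea", "apnea",    "normal", "hypopnea"],
    "nurse":     ["normal", "hypopnea", "apnea", "hypopnea", "normal", "hypopnea"],
    "physician": ["normal", "apnea",    "apnea", "apnea",    "normal", "hypopnea"],
}

pairwise = [cohen_kappa(ratings[r1], ratings[r2])
            for r1, r2 in combinations(ratings, 2)]
print(f"mean pairwise kappa = {sum(pairwise) / len(pairwise):.2f}")
```

On the interpretation used in the abstract: kappa values in the 0.71-0.87 range reported for breathing variables are conventionally read as substantial-to-excellent agreement, whereas the 0.34-0.57 range for sleep variables falls in the fair-to-moderate band, consistent with the authors' conclusion that breathing scoring was the more reliable of the two.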