Agreement in computer-assisted manual scoring of polysomnograms across sleep centers.

STUDY OBJECTIVES: To determine intersite agreement in respiratory event scoring of polysomnograms (PSGs) using different hypopnea definitions.

DESIGN: Technical assessment.

SETTING: Five academic medical centers.

PARTICIPANTS: N/A.

INTERVENTIONS: N/A.

MEASUREMENTS AND RESULTS: Seventy good-quality PSGs performed in middle-aged women were manually scored by two experienced technologists at each of the five sleep centers using the particular laboratory's own software system. Studies were scored once by each scorer using American Academy of Sleep Medicine (AASM) standards for scoring sleep stages, arousals, and apneas. Hypopneas were then scored using three different AASM criteria: recommended, alternate, and research (Chicago). Means of each PSG variable for the scorers at each site were used to calculate an across-site intraclass correlation coefficient (ICC). Average AHI across the 10 scorers was 7.4 ± 12.3 (standard deviation) events/h using recommended criteria (ICC 0.984; 95% confidence interval [CI] 0.977-0.990), 12.1 ± 13.3 events/h using alternate criteria (ICC 0.947; 95% CI 0.889-0.972), and 15.1 ± 13.9 events/h with Chicago criteria (ICC 0.800; 95% CI 0.768-0.828). ICC across sites was 0.870 (95% CI 0.847-0.889) for total sleep time, 0.861 (95% CI 0.837-0.881) for number of obstructive apneas, and 0.683 (95% CI 0.640-0.722) for number of central apneas. ICCs across sites for hypopneas were very good using recommended criteria (ICC 0.843; 95% CI 0.820-0.870) but decreased when alternate criteria (ICC 0.728; 95% CI 0.689-0.763) and Chicago criteria (ICC 0.535; 95% CI 0.485-0.583) were used.

CONCLUSION: Experienced scorers at different laboratories have very good agreement in hypopnea and AHI results when good-quality PSGs are scored using AASM-recommended criteria. Substantial degradation of reliability was observed for alternative definitions of hypopneas, particularly that proposed for research.
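The abstract does not say which form of the ICC was computed. As an illustration only, the sketch below assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1), applied to a matrix whose rows are PSGs and whose columns are sites (each cell being the mean of that site's two scorers). The function name `icc_2_1` and the demo values are hypothetical and are not taken from the study.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: 2-D array-like, rows = subjects (e.g. PSGs),
            columns = raters (e.g. site-level means).
    Assumption: this ICC form is chosen for illustration; the abstract
    does not state which form the authors used.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape  # n subjects, k raters

    grand_mean = x.mean()
    row_means = x.mean(axis=1)  # per-subject means
    col_means = x.mean(axis=0)  # per-rater means

    # Two-way ANOVA sums of squares (one observation per cell)
    ss_rows = k * np.sum((row_means - grand_mean) ** 2)
    ss_cols = n * np.sum((col_means - grand_mean) ** 2)
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical AHI values (events/h): 4 PSGs x 3 sites, made up for the demo
    demo_ahi = [
        [5.1, 5.4, 4.9],
        [22.0, 21.3, 23.1],
        [0.8, 1.1, 0.9],
        [12.4, 13.0, 11.8],
    ]
    print(f"ICC(2,1) = {icc_2_1(demo_ahi):.3f}")
```

Because this form penalizes systematic offsets between raters as well as random disagreement, a site that consistently scores more hypopneas than the others would lower the coefficient even if its ranking of patients matched theirs.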
