Evaluating the Endoscopic Reference Score for eosinophilic esophagitis: moderate to substantial intra- and interobserver reliability

Background and study aims: The Endoscopic Reference Score (EREFS) for endoscopic assessment of eosinophilic esophagitis was recently introduced and showed good interobserver agreement for most signs. The EREFS has not yet been evaluated by other investigators, and intraobserver agreement has not been assessed. The aim of this study was to further validate the EREFS by assessing interobserver and intraobserver agreement on endoscopic signs in patients with eosinophilic esophagitis.

Patients and methods: High-quality endoscopic images of the esophagus were obtained from 30 patients with eosinophilic esophagitis (age 36 years, range 23 – 46 years; 5 female), 6 of whom were in remission. At least three depersonalized images per patient were incorporated into a slideshow. Images were scored by four expert and four trainee endoscopists who were blinded to the patients’ conditions, and interobserver agreement was assessed. After 4 weeks, the images were rescored in a different order to assess intraobserver agreement.

Results: Interobserver agreement was substantial for rings (κ 0.70), white exudates (κ 0.63), and crepe paper esophagus (κ 0.62), moderate for furrows (κ 0.49) and strictures (κ 0.54), and slight for edema (κ 0.12). Intraobserver agreement was substantial for rings (median κ 0.64, IQR 0.46 – 0.70), furrows (median κ 0.69, IQR 0.50 – 0.89), and crepe paper esophagus (median κ 0.69, IQR 0.62 – 0.83), moderate for white exudates (median κ 0.58, IQR 0.54 – 0.71) and strictures (median κ 0.54, IQR 0.33 – 0.70), and no better than chance for edema (median κ 0.00, IQR 0.00 – 0.29). Inter- and intraobserver agreement did not differ substantially between expert and trainee endoscopists.

Conclusions: Using the EREFS, endoscopic signs of eosinophilic esophagitis were scored consistently by expert and trainee endoscopists.
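The κ values reported in the Results can be read as chance-corrected agreement. The abstract does not state which κ variant was used, so the following is only a minimal sketch, assuming Cohen's κ for one endoscopist's two scoring rounds and the commonly used Landis and Koch cut-offs for the verbal labels. The ratings in the example are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("rating sequences must be non-empty and of equal length")
    n = len(first)
    # Observed proportion of images scored identically in both rounds.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from each round's category frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / (n * n)
    if expected == 1.0:
        # Only one category was ever used; kappa is undefined in this case.
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal label for kappa using the Landis and Koch cut-offs (assumed here)."""
    if kappa < 0:
        return "poor (less than chance)"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical example: one sign (1 = present, 0 = absent) scored on ten
# images by the same endoscopist in two rounds.
round_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
round_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(round_1, round_2)
print(f"kappa = {kappa:.2f} ({landis_koch(kappa)})")
```

In this example the two rounds agree on 8 of 10 images, yet κ is only about 0.52, because 58% agreement would already be expected by chance given how often each category was used.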
