Reproducibility of histological assessments of disease activity in UC

Objective: Histopathology is potentially an important outcome measure in UC. Multiple histological disease activity (HA) indices, including the Geboes score (GS) and modified Riley score (MRS), have been developed; however, the operating properties of these instruments are not clearly defined. We assessed the reproducibility of existing measures of HA.

Design: Five experienced pathologists with GI pathology fellowship training and expertise in IBD evaluated 49 UC colon biopsies on three separate occasions at least two weeks apart, scoring the GS, the MRS and a global rating of histological severity on a 100 mm visual analogue scale (VAS). The reproducibility of each grading system and of individual instrument items was quantified by estimates of intraclass correlation coefficients (ICCs) based on two-way random effects models. Uncertainty of the estimates was quantified by two-sided 95% CIs obtained using the non-parametric cluster bootstrap method. Biopsies responsible for the greatest disagreement based on the ICC estimates were identified. A consensus process was used to determine the most common sources of measurement disagreement. Recommendations for minimising disagreement were subsequently generated.

Results: Intrarater ICCs (95% CIs) for the total GS, MRS and VAS scores were 0.82 (0.73 to 0.88), 0.71 (0.63 to 0.80) and 0.79 (0.72 to 0.85), respectively. Corresponding inter-rater ICCs were substantially lower: 0.56 (0.39 to 0.67), 0.48 (0.35 to 0.66) and 0.61 (0.47 to 0.72). Correlation between the GS and VAS was 0.62 and between the MRS and VAS was 0.61.

Conclusions: Although ‘substantial’ to ‘almost perfect’ ICCs for intrarater agreement were found in the assessment of HA in UC, ICCs for inter-rater agreement were considerably lower. According to the consensus process results, standardisation of item definitions and modification of the existing indices are required to create an optimal UC histological instrument.
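The reliability analysis described under Design can be illustrated with a minimal sketch. It assumes a simplified setting that the abstract does not specify: one reading per rater per biopsy, the Shrout and Fleiss ICC(2,1) form of the two-way random effects ICC, and a percentile bootstrap interval in which whole biopsies (the clusters) are resampled with replacement. The function names `icc2_1` and `cluster_bootstrap_ci` and the simulated VAS scores are hypothetical and are not taken from the study.

```python
import numpy as np


def icc2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is an (n_biopsies, n_raters) array with one score per cell.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)  # per-biopsy means
    col_means = x.mean(axis=0)  # per-rater means

    # Mean squares from the two-way ANOVA decomposition.
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)  # between biopsies
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)  # between raters
    resid = x - row_means[:, None] - col_means[None, :] + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))        # residual

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def cluster_bootstrap_ci(scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for ICC(2,1), resampling whole biopsies (rows)."""
    x = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw biopsies with replacement
        estimates[b] = icc2_1(x[idx])     # all raters' scores stay together
    lo, hi = np.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi


# Simulated (not real) VAS scores: 49 biopsies read by 5 raters.
rng = np.random.default_rng(1)
true_severity = rng.uniform(0, 100, size=49)
rater_bias = rng.normal(0, 8, size=5)
noise = rng.normal(0, 10, size=(49, 5))
vas = true_severity[:, None] + rater_bias[None, :] + noise

print("inter-rater ICC(2,1):", round(icc2_1(vas), 2))
print("95% bootstrap CI:", cluster_bootstrap_ci(vas))
```

Resampling rows rather than individual scores keeps each biopsy's ratings together, which is what makes this a cluster bootstrap. An analysis of the full design, with three readings per rater, would need a model that also separates intrarater from inter-rater variance.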
