Scoring inflammatory activity of the spine by magnetic resonance imaging in ankylosing spondylitis: a multireader experiment.

OBJECTIVE Magnetic resonance imaging (MRI) of the spine is increasingly important for assessing inflammatory activity in clinical trials in patients with ankylosing spondylitis (AS). We investigated the feasibility, inter-reader reliability, sensitivity to change, and discriminatory ability of 3 different scoring methods for MRI activity and change in activity of the spine in patients with AS.
METHODS Thirty sets of spinal MRI at baseline and after 24 weeks of follow-up were derived from a randomized clinical trial comparing a tumor necrosis factor (TNF)-blocking drug (n = 20) with placebo (n = 10), and were selected to cover a wide range of activity at baseline and of change in activity. The sets were presented electronically in a partial Latin-square design to 9 experienced readers from different countries (Europe, Canada). Readers scored each set of MRI 3 times, using 3 different methods: the Ankylosing Spondylitis spine Magnetic Resonance Imaging-activity score [ASspiMRI-a, grading activity (0-6) per vertebral unit in 23 units]; the Berlin modification of the ASspiMRI-a; and the Spondyloarthritis Research Consortium of Canada (SPARCC) scoring system, which scores the 6 vertebral units considered by the reader to be the most abnormal, with additional scores for "depth" and "intensity." Both the order of the methods used by each reader and the timepoints (before/after treatment) were randomized. Feasibility of each scoring system was evaluated by measuring the mean time needed to score each set of MRI. Inter-reader reliability was evaluated by the smallest detectable change (SDC) and by intraclass correlation coefficients (ICC), for all readers together and for all possible reader pairs separately. Sensitivity to change was investigated by calculating Guyatt's effect size on change scores. Discriminatory ability was assessed using Z-scores (Mann-Whitney test) comparing the change in score between patients treated with the TNF-blocking drug and those given placebo.
RESULTS The mean time to score one set of MRI was shortest for the Berlin method. SDC was lowest for the Berlin method and highest for SPARCC. Overall inter-reader ICC per method were between 0.49 and 0.77 for scoring activity status, and between 0.46 and 0.72 for scoring activity change. ICC for all possible reader pairs fluctuated much more within each method, with lowest observed values of about 0.05 (very low agreement) and highest observed values over 0.90 (excellent agreement). In general, ICC for SPARCC were consistently higher than for the other systems. Sensitivity to change differed per reader and was more consistent with SPARCC than with the other methods, but was in general excellent for all 3 methods. Discrimination between groups (TNF-blocker vs placebo) assessed by Z-scores was good and comparable among methods.
CONCLUSION This experiment demonstrates the feasibility of multiple-reader MRI scoring exercises for method comparison, and provides evidence for the feasibility, reliability, sensitivity to change, and discriminatory capacity of all 3 tested scoring systems for assessing spinal activity on MRI in patients with AS in clinical trials. On the basis of these results it is not possible to prioritize one of the 3 methods.
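
The two change-based statistics named in the methods can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: Guyatt's effect size is taken in its usual form (mean change in the treated group divided by the standard deviation of change in the placebo group), and the Mann-Whitney Z-score uses the normal approximation without a tie correction. The function names and the change scores are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def guyatt_effect_size(treated_change, placebo_change):
    """Mean change in the treated group divided by the SD of change
    in the placebo group (assumed form of Guyatt's effect size)."""
    return mean(treated_change) / stdev(placebo_change)


def mann_whitney_z(x, y):
    """Z-score from the normal approximation to the Mann-Whitney U
    statistic (no tie correction, no continuity correction)."""
    n1, n2 = len(x), len(y)
    # U counts the pairs in which the x value exceeds the y value;
    # tied pairs contribute one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mu) / sigma


# Hypothetical change scores (week 24 minus baseline);
# negative values mean less inflammatory activity.
treated = [-8.0, -6.5, -7.0, -5.0, -9.5, -4.0]
placebo = [-1.0, 0.5, -2.0, 1.5, 0.0]

print("Guyatt effect size:", round(guyatt_effect_size(treated, placebo), 2))
print("Mann-Whitney Z:", round(mann_whitney_z(treated, placebo), 2))
```

In this sketch a large absolute effect size indicates that the treated group's mean change is large relative to the background variability seen under placebo, and a large absolute Z-score indicates that the change scores of the two groups are well separated.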
