Observer Variation in Interpreting 18F-FDG PET/CT Findings for Lymphoma Staging

Many studies demonstrate high accuracy for PET in staging lymphoma, but few assess observer variation. This study quantified intra- and interobserver agreement in staging lymphoma with PET/CT. Methods: The PET/CT images of 100 patients with lymphoma referred for staging were reviewed by 3 experienced observers, with 2 of the observers reviewing each series a second time. Ann Arbor stage and involvement of individual nodal and extranodal regions were assessed. Weighted κ (κw) and the intraclass correlation coefficient were used to compare ratings. Results: Intra- and interobserver agreement was high for Ann Arbor stage (κw = 0.79–0.91), number of nodal regions involved (intraclass correlation coefficient, 0.83–0.93), and presence of extranodal disease (κ = 0.74–0.86). Agreement was also high for all nodal regions (κw > 0.60) except the hilar (κw = 0.56–0.82) and infraclavicular (κw = 0.14–0.55) regions. Agreement was lower for bowel involvement (κw = 0.37–0.71). Conclusion: Experienced observers showed a high level of agreement when staging lymphoma with PET/CT, supporting its use as a robust noninvasive staging tool. Further work is needed to evaluate observer variation in restaging during and after chemotherapy.
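The agreement statistics quoted above are standard ordinal-agreement measures. As an illustration only (not the authors' code, and using made-up ratings rather than study data), a linearly weighted Cohen's κ for two observers' Ann Arbor stage assignments can be computed with scikit-learn as sketched below; the intraclass correlation coefficient for the number of involved nodal regions could be obtained analogously, for example with a dedicated reliability package.

```python
# Minimal sketch, assuming scikit-learn is available. The stage values are
# hypothetical and serve only to show how a weighted kappa is computed.
from sklearn.metrics import cohen_kappa_score

# Ann Arbor stages (I-IV coded 1-4) assigned by two observers to the same
# 10 illustrative patients.
observer_a = [1, 2, 2, 3, 4, 1, 3, 2, 4, 1]
observer_b = [1, 2, 3, 3, 4, 1, 3, 2, 3, 1]

# Linear weighting penalises disagreements in proportion to how far apart
# the ordinal stages are, analogous to the kappa_w reported in the abstract.
kappa_w = cohen_kappa_score(observer_a, observer_b, weights="linear")
print(f"weighted kappa = {kappa_w:.2f}")
```

Quadratic weighting (weights="quadratic") penalises large disagreements more heavily; which weighting scheme matches the paper's κw is an assumption here, not something stated in the abstract.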
