Interobserver and intraobserver variability of RECIST assessment in ovarian cancer

Objectives: Response assessment using the Response Evaluation Criteria In Solid Tumors (RECIST) relies on reproducible unidimensional tumor measurements. This study assessed intraobserver and interobserver variability of target lesion selection and measurement according to RECIST version 1.1 in patients with ovarian cancer.

Methods: Eight international radiologists independently viewed 47 images demonstrating malignant lesions in patients with ovarian cancer, and selected and measured lesions according to RECIST version 1.1. Thirteen images were viewed twice. Interobserver variability of selection and measurement was calculated for all images. Intraobserver variability of selection and measurement was calculated for the images viewed twice. Lesions were classified according to their anatomical site as pulmonary, hepatic, pelvic mass, peritoneal, lymph nodal, or other. Lesion selection variability was assessed by calculating the reproducibility rate. Lesion measurement variability was assessed with the intraclass correlation coefficient.

Results: From 47 images, 82 distinct lesions were identified. For lesion selection, the interobserver and intraobserver reproducibility rates were high, at 0.91 and 0.93, respectively. Interobserver selection reproducibility was highest (reproducibility rate 1) for pelvic mass and other lesions. Intraobserver selection reproducibility was highest (reproducibility rate 1) for pelvic mass, hepatic, nodal, and other lesions. Selection reproducibility was lowest for peritoneal lesions (interobserver reproducibility rate 0.76 and intraobserver reproducibility rate 0.69). For lesion measurement, the overall interobserver and intraobserver intraclass correlation coefficients showed very good concordance, at 0.84 and 0.94, respectively. The interobserver intraclass correlation coefficient showed very good concordance for hepatic, pulmonary, peritoneal, and other lesions, ranging from 0.84 to 0.97, but only moderate concordance for lymph node lesions (0.58). The intraobserver intraclass correlation coefficient showed very good concordance for all lesions, ranging from 0.82 to 0.99. Overall, 85% of the total measurement variability resulted from interobserver measurement differences.

Conclusions: Although selection and measurement concordance were high, there was significant interobserver and intraobserver variability, most of which resulted from interobserver variability. Compared with other lesions, peritoneal lesions had the lowest selection reproducibility, and lymph node lesions had the lowest measurement concordance. These factors need consideration to improve response assessment, especially as progression-free survival remains the most common endpoint in phase III trials.
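
The abstract does not state which form of the intraclass correlation coefficient was used. As a minimal sketch, assuming the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1), after Shrout and Fleiss), measurement concordance across observers could be computed as below. The function name `icc2_1` and the example table are illustrative assumptions, not taken from the study.

```python
from typing import Sequence


def icc2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings[i][j] is the measurement of lesion i by observer j.
    This form is an assumption; the abstract does not name the ICC variant.
    """
    n = len(ratings)      # number of lesions (rows)
    k = len(ratings[0])   # number of observers (columns)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 lesions and >= 2 observers")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one measurement per cell).
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between lesions
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between observers
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Illustrative table only (not study data): 6 items rated by 4 observers.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc2_1(example), 2))  # prints 0.29
```

In this form, systematic offsets between observers (the `ms_cols` term) lower the coefficient, which suits a question about whether different radiologists report the same lesion diameter. A consistency-only form would ignore such offsets.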
