Use and misuse of waterfall plots.

BACKGROUND "Waterfall plots" are used to describe changes in tumor size observed in clinical studies. Here we assess the criteria used to generate waterfall plots and the impact of measurement error on them.

METHODS We reviewed published waterfall plots to investigate variability in the criteria used to define them. We then compared waterfall plots generated by different observers for 24 patients enrolled in a completed phase I study of solid tumors with available computed tomography (CT) scans. Tumor measurements were made independently from CT scans according to Response Evaluation Criteria in Solid Tumors (RECIST) 1.1 by four board-certified radiologists and four medical oncologists. Interobserver variability was quantified and compared with reference measurements reported for the phase I study. All statistical tests were two-sided.

RESULTS There was substantial variability in the criteria used to generate published waterfall plots. In the internal study, the results differed statistically significantly among all eight readers (P = .01, variance = 197.1, SD = 14.0) and among the oncologists (P = .01, variance = 319.0, SD = 17.9), but not among the radiologists (P = .68, variance = 70.8, SD = 8.4). Different observers classified one to five patients as having a partial response and 12-19 patients as having stable disease. Similar variability in categorization of response was observed when these error rates were applied to published waterfall plots.

CONCLUSION Waterfall plots are subject to substantial variability in the criteria used to define them and are influenced by measurement errors; they should be generated by trained radiologists. Caution should be exercised when interpreting results of waterfall plots in the context of clinical trials.
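The sensitivity of response categories to reader disagreement can be illustrated with a minimal sketch. It assumes the standard RECIST 1.1 percentage cutoffs (a decrease of at least 30% in the sum of diameters for partial response, an increase of at least 20% for progressive disease) applied to change from baseline, and it ignores complete response, the 5 mm absolute-increase rule, new lesions, and the use of the nadir as the reference for progression. The function names and the two readers' measurements are hypothetical and are not data from the study.

```python
from typing import List, Tuple

# RECIST 1.1 percentage cutoffs, simplified to change from baseline.
PR_THRESHOLD = -30.0  # partial response: decrease of at least 30%
PD_THRESHOLD = 20.0   # progressive disease: increase of at least 20%


def percent_change(baseline_mm: float, followup_mm: float) -> float:
    """Percent change in the sum of target-lesion diameters."""
    return 100.0 * (followup_mm - baseline_mm) / baseline_mm


def classify(change_pct: float) -> str:
    """Map a percent change to PR, PD, or SD (stable disease)."""
    if change_pct <= PR_THRESHOLD:
        return "PR"
    if change_pct >= PD_THRESHOLD:
        return "PD"
    return "SD"


def waterfall(measurements: List[Tuple[float, float]]) -> List[float]:
    """Percent changes sorted from largest increase to largest decrease,
    which is the bar order of a conventional waterfall plot."""
    changes = [percent_change(b, f) for b, f in measurements]
    return sorted(changes, reverse=True)


# Hypothetical (baseline, follow-up) sums in mm for the same three patients,
# as measured by two different readers.
reader_a = [(50.0, 34.0), (40.0, 30.0), (60.0, 75.0)]
reader_b = [(50.0, 36.0), (40.0, 27.0), (60.0, 70.0)]

for name, reader in (("A", reader_a), ("B", reader_b)):
    labels = [classify(percent_change(b, f)) for b, f in reader]
    print(name, labels, [round(c, 1) for c in waterfall(reader)])
```

In this made-up example, follow-up measurements that differ by only a few millimetres between readers move each of the three patients into a different response category, which is the kind of effect the abstract reports at the cohort level.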
