Practical no-gold-standard evaluation framework for quantitative imaging methods: application to lesion segmentation in positron emission tomography

Abstract. Recently, a class of no-gold-standard (NGS) techniques has been proposed to evaluate quantitative imaging methods using patient data. These techniques provide figures of merit (FoMs) that quantify the precision of the estimated quantitative value without requiring repeated measurements or a gold standard. However, applying these techniques to patient data presents several practical difficulties, including assessing the underlying assumptions, accounting for patient-sampling-related uncertainty, and assessing the reliability of the estimated FoMs. To address these issues, we propose statistical tests that provide confidence in the underlying assumptions and in the reliability of the estimated FoMs. Furthermore, the NGS technique is integrated within a bootstrap-based methodology to account for patient-sampling-related uncertainty. The developed NGS framework was applied to evaluate four methods for segmenting lesions from 18F-fluoro-2-deoxyglucose positron emission tomography images of patients with head-and-neck cancer on the task of precisely measuring the metabolic tumor volume. The NGS technique consistently identified the same segmentation method as the most precise. The proposed framework provided confidence in these results, even when gold-standard data were not available. As expected, the bootstrap-based methodology indicated that the performance of the NGS technique improved with larger numbers of patient studies, and it yielded consistent results as long as data from more than 80 lesions were available for the analysis.
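The bootstrap step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic data, and in particular `placeholder_fom` are assumptions made for illustration, and the placeholder is a simple stand-in rather than the NGS estimator. The sketch resamples lesions with replacement and records how often each segmentation method is ranked as the most precise, which is one way to gauge patient-sampling-related uncertainty in a ranking.

```python
import numpy as np


def placeholder_fom(values):
    """Hypothetical stand-in FoM (NOT the NGS estimator).

    Spread of each method's values around the per-lesion mean across
    methods; lower values are treated as more precise.
    """
    residuals = values - values.mean(axis=1, keepdims=True)
    return residuals.std(axis=0, ddof=1)


def bootstrap_best_method(values, fom, n_boot=1000, seed=0):
    """Fraction of bootstrap replicates in which each method has the lowest FoM.

    values: array of shape (n_lesions, n_methods), one measurement per
    lesion and method. Lesions (rows) are resampled with replacement.
    """
    rng = np.random.default_rng(seed)
    n_lesions, n_methods = values.shape
    wins = np.zeros(n_methods, dtype=int)
    for _ in range(n_boot):
        idx = rng.integers(0, n_lesions, size=n_lesions)
        wins[np.argmin(fom(values[idx]))] += 1
    return wins / n_boot


# Synthetic data, assumed for illustration only: 100 lesions, 4 methods,
# with method 0 given the smallest measurement noise.
data_rng = np.random.default_rng(42)
true_volume = data_rng.uniform(5.0, 50.0, size=100)
noise_sd = np.array([1.0, 5.0, 6.0, 8.0])
measured = true_volume[:, None] + data_rng.normal(0.0, noise_sd, size=(100, 4))

win_fraction = bootstrap_best_method(measured, placeholder_fom)
print(win_fraction)
```

Passing the FoM as a callable keeps the resampling loop independent of how precision is estimated, so an actual NGS estimator could be substituted for the placeholder. Repeating the call on subsets with fewer lesions would show how the stability of the ranking depends on sample size.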
