Theoretical analysis and experimental validation of volume bias of soft Dice optimized segmentation maps in the context of inherent uncertainty

Clinical interest often lies in measuring the volume of a structure, which is typically derived from a segmentation. To evaluate and compare segmentation methods, the similarity between a segmentation and a predefined ground truth is measured using popular discrete metrics, such as the Dice score. Recent segmentation methods use a differentiable surrogate metric, such as soft Dice, as part of the loss function during the learning phase. In this work, we first briefly describe how to derive volume estimates from a segmentation that is potentially inherently uncertain or ambiguous. We then present a theoretical analysis and an experimental validation that link this inherent uncertainty to common loss functions for training CNNs, namely cross-entropy and soft Dice. We find that, even though soft Dice optimization improves performance with respect to the Dice score and other measures, it may introduce a volume bias for tasks with high inherent uncertainty. These findings indicate some of the method's clinical limitations and suggest performing a closer ad hoc volume analysis with an optional re-calibration step.
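The volume bias mentioned in the abstract can be illustrated with a small numerical sketch. It is not the paper's experiment. It assumes a hypothetical region of 100 voxels that a model cannot tell apart (so it outputs one shared probability `s` for all of them), three equally plausible annotations with foreground volumes of 20, 40 and 60 voxels, and a volume estimate taken as the sum of the soft predictions. All names and numbers below are illustrative assumptions.

```python
import numpy as np

# Hypothetical setup (not from the paper): 100 indistinguishable voxels and
# three equally likely annotations with different foreground volumes.
N_VOXELS = 100
volumes = np.array([20.0, 40.0, 60.0])  # foreground voxel count per annotation


def cross_entropy_risk(s, volumes, n_voxels):
    """Mean cross-entropy over annotations for a shared prediction s."""
    mean_fg = volumes.mean()
    return -(mean_fg * np.log(s) + (n_voxels - mean_fg) * np.log(1.0 - s))


def soft_dice_risk(s, volumes, n_voxels):
    """Mean soft Dice loss over annotations for a shared prediction s."""
    # Soft Dice per annotation: 2 * sum(s * y) / (sum(s) + sum(y)).
    dice = 2.0 * s * volumes / (n_voxels * s + volumes)
    return 1.0 - dice.mean()


# Grid search for the prediction that minimizes each empirical risk.
grid = np.linspace(0.001, 0.999, 999)
ce_risk = np.array([cross_entropy_risk(s, volumes, N_VOXELS) for s in grid])
sd_risk = np.array([soft_dice_risk(s, volumes, N_VOXELS) for s in grid])

s_ce = grid[np.argmin(ce_risk)]
s_dice = grid[np.argmin(sd_risk)]

# Volume estimate: sum of soft predictions over the region (an assumption).
vol_ce = s_ce * N_VOXELS
vol_dice = s_dice * N_VOXELS

print(f"mean annotated volume : {volumes.mean():.1f}")
print(f"cross-entropy optimum : s={s_ce:.3f}, volume={vol_ce:.1f}")
print(f"soft Dice optimum     : s={s_dice:.3f}, volume={vol_dice:.1f}")
```

In this toy setting the cross-entropy optimum sits at the mean foreground fraction, so its volume estimate matches the mean annotated volume. Every soft Dice term increases with `s`, so the soft Dice optimum is pushed to the upper end of the grid and the volume is overestimated. With other annotation sets, for example ones that include empty annotations together with a smoothing term, the bias could plausibly go the other way.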
