Segmentation with Multiple Acceptable Annotations: A Case Study of Myocardial Segmentation in Contrast Echocardiography

Most existing deep learning-based frameworks for image segmentation assume that a unique ground truth is known and can be used for performance evaluation. This is true for many applications, but not all. Myocardial segmentation in Myocardial Contrast Echocardiography (MCE), a critical task in automatic myocardial perfusion analysis, is an example. Due to the low resolution and serious artifacts in MCE data, annotations from different cardiologists can vary significantly, and it is hard to tell which one is the best. In this case, how can we evaluate segmentation performance, and how should we train the neural network? In this paper, we address the first problem by proposing a new extended Dice metric that effectively evaluates segmentation performance when multiple acceptable ground truths are available. We then address the second problem by incorporating the proposed metric into a loss function that enables neural networks to flexibly learn general features of the myocardium. Experimental results on our clinical MCE data set demonstrate that a neural network trained with the proposed loss function outperforms existing methods that try to obtain a unique ground truth from multiple annotations, both quantitatively and qualitatively. Finally, our grading study shows that extended Dice identifies segmentation results that need manual correction better than standard Dice does.
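
To make the evaluation problem concrete, the following minimal sketch computes the standard Dice coefficient of one prediction against two differing annotations. It is not the extended Dice proposed in the paper, whose definition is not given in this abstract. The function names, the annotator labels, and the toy one-dimensional masks are all hypothetical and chosen only for illustration.

```python
def dice(pred, truth):
    """Standard Dice coefficient between two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labeled foreground in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


def dice_per_annotator(pred, annotations):
    """Score one prediction against each annotator's mask separately."""
    return {name: dice(pred, mask) for name, mask in annotations.items()}


# Hypothetical toy data: two annotators disagree on where the region lies.
prediction = [0, 1, 1, 1, 0, 0]
annotations = {
    "cardiologist_a": [0, 1, 1, 1, 0, 0],
    "cardiologist_b": [0, 0, 1, 1, 1, 0],
}

scores = dice_per_annotator(prediction, annotations)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

In this toy case the same prediction scores a perfect Dice against one annotation and a noticeably lower one against the other, even though both annotations are assumed acceptable. This is the ambiguity that motivates a metric defined over multiple annotations.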
