Worst-Case Local Boundary Precision in Global Measures of Segmentation Reproducibility

The commonly used measures of reproducibility for semiautomatic and interactive image segmentation algorithms provide global estimates of how precisely an object boundary is located across a group of segmentations. Here, the joint Dice similarity coefficient, joint Tanimoto coefficient, generalized Tanimoto coefficient, coefficient of variation of volume, and intra-class correlation coefficient of volume are interpreted with respect to a new, explicit measure of worst-case local object boundary precision. Experiments established 95% confidence intervals on this new measure for ranges of each global reproducibility measure, allowing the global measures to be interpreted in terms of worst-case local precision. The joint Tanimoto coefficient and joint Dice coefficient are shown to be highly unstable as the number of segmentations being compared varies. All of the existing measures of segmentation reproducibility are found to be significantly flawed, with the exception of the generalized Tanimoto coefficient.
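As a rough illustration of how the overlap-based global measures named above behave, the following Python sketch computes them on toy data. The abstract does not state the formulas, so the definitions used here are assumptions: joint Tanimoto as the size of the common intersection over the size of the common union, joint Dice as n times the common intersection over the summed region sizes, generalized Tanimoto as summed pairwise intersections over summed pairwise unions, and coefficient of variation of volume as the sample standard deviation over the mean of the region sizes. The function names, the `square` helper, and the shifted-square data are hypothetical and are not the experimental setup of the paper.

```python
from itertools import combinations
from statistics import mean, stdev


def joint_tanimoto(segs):
    """Assumed definition: |intersection of all| / |union of all|."""
    inter = set.intersection(*segs)
    union = set.union(*segs)
    return len(inter) / len(union)


def joint_dice(segs):
    """Assumed definition: n * |intersection of all| / sum of region sizes."""
    inter = set.intersection(*segs)
    return len(segs) * len(inter) / sum(len(s) for s in segs)


def generalized_tanimoto(segs):
    """Assumed definition: summed pairwise intersections / summed pairwise unions."""
    pairs = list(combinations(segs, 2))
    num = sum(len(a & b) for a, b in pairs)
    den = sum(len(a | b) for a, b in pairs)
    return num / den


def cv_volume(segs):
    """Coefficient of variation of volume: sample std / mean of pixel counts."""
    volumes = [len(s) for s in segs]
    return stdev(volumes) / mean(volumes)


def square(x0, y0, size=10):
    """Hypothetical toy segmentation: a filled square as a set of pixel coordinates."""
    return {(x, y) for x in range(x0, x0 + size) for y in range(y0, y0 + size)}


# Four toy segmentations: the same 10x10 square, shifted right by 0..3 pixels.
segs = [square(shift, 0) for shift in range(4)]

for n in (2, 3, 4):
    subset = segs[:n]
    print(
        n,
        round(joint_tanimoto(subset), 3),
        round(joint_dice(subset), 3),
        round(generalized_tanimoto(subset), 3),
        round(cv_volume(subset), 3),
    )
```

On this toy data the assumed joint Tanimoto coefficient falls from about 0.82 with two segmentations to about 0.54 with four, while the coefficient of variation of volume stays at zero because every shifted square has the same pixel count. This only illustrates the kind of behavior the abstract describes under the assumed definitions; it does not reproduce the paper's experiments or its confidence intervals.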
