STAPLE performance assessed on crowdsourced sclera segmentations

The Simultaneous Truth and Performance Level Estimation (STAPLE) algorithm is frequently used in medical image segmentation when no ground truth (GT) is available. In this paper, we investigate how many inexperienced users are required to establish a reliable STAPLE-based GT and how many vertices each user should place for a point-based segmentation. We employ “WeLineation”, a novel web-based system for crowdsourcing segmentations. In the study, 44 users delivered 2,060 masks on 75 different photographic images of the human eye, in which they had to segment the sclera. The GT was estimated by applying STAPLE to all masks. STAPLE was then recomputed using fewer user contributions, and the results were compared to this GT. When an error rate below 2% is required, the same segmentation performance is obtained with either 13 experienced or 22 rather inexperienced users. More than 10 vertices should be placed on the delineation contour to reach an accuracy above 95%. On average, a vertex should be placed every 81 pixels along the segmentation contour. The results indicate that knowledge about the users' performance can reduce the number of segmentation masks per image needed to estimate a reliable GT. We therefore recommend gathering performance parameters of users during a crowdsourcing study and applying this information to the assignment process. In this way, the cost-effectiveness of a crowdsourced segmentation study can be improved.
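The evaluation scheme described above (estimate a reference from all masks, re-estimate from fewer masks, compare) can be illustrated with a minimal sketch of binary STAPLE as an expectation-maximization loop. This is not the authors' implementation: the function names `staple_binary` and `error_rate`, the initial sensitivity and specificity of 0.9, the global foreground prior taken from the mean of all rater decisions, the 0.5 threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def staple_binary(masks, max_iter=100, tol=1e-6):
    """Illustrative binary STAPLE via EM (a sketch, not the paper's code).

    masks: array of shape (R, N) with 0/1 decisions of R raters on N pixels.
    Returns (W, p, q): per-pixel foreground probability, and per-rater
    sensitivity p and specificity q.
    """
    D = np.asarray(masks, dtype=float)
    n_raters, n_pixels = D.shape
    eps = 1e-12
    prior = D.mean()                # assumed global foreground prior
    p = np.full(n_raters, 0.9)      # assumed initial sensitivities
    q = np.full(n_raters, 0.9)      # assumed initial specificities
    W = np.full(n_pixels, prior)
    for _ in range(max_iter):
        # E-step: posterior probability that each pixel is foreground.
        a = prior * np.prod(np.where(D == 1, p[:, None], 1 - p[:, None]), axis=0)
        b = (1 - prior) * np.prod(np.where(D == 1, 1 - q[:, None], q[:, None]), axis=0)
        W_new = a / (a + b + eps)
        # M-step: update each rater's performance parameters.
        p = (D @ W_new) / (W_new.sum() + eps)
        q = ((1 - D) @ (1 - W_new)) / ((1 - W_new).sum() + eps)
        # Keep parameters away from exactly 0 or 1 to avoid 0/0 in the E-step.
        p = np.clip(p, 1e-6, 1 - 1e-6)
        q = np.clip(q, 1e-6, 1 - 1e-6)
        converged = np.max(np.abs(W_new - W)) < tol
        W = W_new
        if converged:
            break
    return W, p, q


def error_rate(mask_a, mask_b):
    """Fraction of pixels on which two binary masks disagree."""
    return float(np.mean(np.asarray(mask_a) != np.asarray(mask_b)))


if __name__ == "__main__":
    # Synthetic stand-in for crowdsourced masks (hypothetical data).
    rng = np.random.default_rng(0)
    truth = np.zeros(1000, dtype=int)
    truth[:300] = 1
    flip_prob = np.array([0.02, 0.02, 0.05, 0.05, 0.1, 0.1, 0.3])
    flipped = rng.random((flip_prob.size, truth.size)) < flip_prob[:, None]
    masks = np.where(flipped, 1 - truth, truth)

    # Reference estimate from all raters, then an estimate from a subset.
    W_all, p_all, q_all = staple_binary(masks)
    reference = (W_all >= 0.5).astype(int)
    W_sub, _, _ = staple_binary(masks[-3:])
    subset = (W_sub >= 0.5).astype(int)
    print("error of subset estimate vs. reference:", error_rate(subset, reference))
    print("estimated sensitivities:", np.round(p_all, 3))
```

In this sketch, a threshold such as the 2% error rate mentioned in the abstract would be checked by comparing `error_rate(subset, reference)` against 0.02 for growing subsets of raters. The estimated sensitivities and specificities are the kind of per-user performance parameters that could inform which users' masks to request.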
