Rethinking Generalization: The Impact of Annotation Style on Medical Image Segmentation

Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real-world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the “ground-truth” label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g., lesions), where the annotation process is much more subjective and is affected by a number of underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, offers a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.

Jean-Pierre R. Falet | T. Arbel | S. Tsaftaris | Raghav Mehta | Douglas Arnold | B. Nichyporuk | Jillian Cardinell | Justin Szeto
