How Reliable Are Out-of-Distribution Generalization Methods for Medical Image Segmentation?

The recent achievements of Deep Learning rely on the test data being similar in distribution to the training data. In an ideal case, Deep Learning models would achieve Out-of-Distribution (OoD) Generalization, i.e. reliably make predictions on out-of-distribution data. Yet in practice, models usually fail to generalize well when facing a shift in distribution. Several methods have therefore been designed to improve the robustness of the features learned by a model through Regularization- or Domain-Prediction-based schemes. Segmenting medical images such as MRIs of the hippocampus is essential for the diagnosis and treatment of neuropsychiatric disorders. However, these brain images often suffer from distribution shift due to the patient's age and to various pathologies affecting the shape of the organ. In this work, we evaluate OoD Generalization solutions for the problem of hippocampus segmentation in MR data using both fully and semi-supervised training. We find that no method performs reliably in all experiments. Only the V-REx loss stands out, as it remains easy to tune while outperforming a standard U-Net in most cases.
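
As a rough illustration of the regularization idea behind the V-REx loss, the sketch below combines the mean of the per-domain training risks with a penalty on their variance, which is how V-REx is commonly formulated. It is a minimal pure-Python sketch, not the implementation evaluated in this work: the function name `vrex_objective`, the weight `beta`, and the choice of population variance and mean (rather than sample variance and sum) are assumptions made for illustration.

```python
from statistics import fmean, pvariance


def vrex_objective(domain_risks, beta=1.0):
    """Return mean risk plus beta times the variance of per-domain risks.

    domain_risks: one scalar training loss per domain (hypothetical input,
                  e.g. a segmentation loss averaged over each dataset).
    beta:         penalty weight (hypothetical name); a larger beta pushes
                  the model towards equal risk across domains.
    """
    if len(domain_risks) < 2:
        raise ValueError("V-REx needs risks from at least two domains")
    mean_risk = fmean(domain_risks)
    # Population variance of the risks across domains (an assumption here).
    risk_variance = pvariance(domain_risks)
    return mean_risk + beta * risk_variance


# Two training domains with unequal risks: the variance term adds a penalty.
unequal = vrex_objective([0.2, 0.4], beta=10.0)
# Equal risks across domains: the penalty vanishes and only the mean remains.
equal = vrex_objective([0.5, 0.5, 0.5], beta=100.0)
```

With `beta` as the only extra hyperparameter, an objective of this form is comparatively simple to tune, which is consistent with the observation above that V-REx remains easy to tune.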
