The Medical Segmentation Decathlon

International challenges have become the de facto standard for comparative assessment of image analysis algorithms on a specific task. Segmentation is so far the most widely investigated medical image processing task, but the various segmentation challenges have typically been organized in isolation, such that algorithm development was driven by the need to tackle a single specific clinical problem. We hypothesized that a method capable of performing well on multiple tasks will generalize well to a previously unseen task and potentially outperform a custom-designed solution. To investigate the hypothesis, we organized the Medical Segmentation Decathlon (MSD), a biomedical image analysis challenge in which algorithms compete across a multitude of tasks and modalities. The underlying data set was designed to explore the axes of difficulty typically encountered when dealing with medical images, such as small data sets, unbalanced labels, multi-site data and small objects. The MSD challenge confirmed that algorithms with consistently good performance on a set of tasks preserved their good average performance on a different set of previously unseen tasks. Moreover, by monitoring the MSD winner for two years, we found that this algorithm continued generalizing well to a wide range of other clinical problems, further confirming our hypothesis. Three main conclusions can be drawn from this study: (1) state-of-the-art image segmentation algorithms are mature, accurate, and generalize well when retrained on unseen tasks; (2) consistent algorithmic performance across multiple tasks is a strong surrogate of algorithmic generalizability; (3) the training of accurate AI segmentation models is now commoditized to non-AI experts.
