Mixed-supervised segmentation: Confidence maximization helps knowledge distillation

Despite achieving promising results in a breadth of medical image segmentation tasks, deep neural networks (DNNs) require large training datasets with pixel-wise annotations. Obtaining these curated datasets is a cumbersome process, which limits the applicability of DNNs in scenarios where annotated images are scarce. Mixed supervision is an appealing alternative for mitigating this obstacle. In this setting, only a small fraction of the data contains complete pixel-wise annotations, while the remaining images have a weaker form of supervision, e.g., only a handful of pixels are labeled. In this work, we propose a dual-branch architecture, where the upper branch (teacher) receives strong annotations, while the bottom one (student) is driven by limited supervision and guided by the upper branch. Combined with a standard cross-entropy loss over the labeled pixels, our novel formulation integrates two important terms: (i) a Shannon entropy loss defined over the less-supervised images, which encourages confident student predictions in the bottom branch; and (ii) a Kullback-Leibler (KL) divergence term, which transfers the knowledge (i.e., predictions) of the strongly supervised branch to the less-supervised branch and guides the entropy (student-confidence) term to avoid trivial solutions. We show that the synergy between the entropy and KL divergence terms yields substantial improvements in performance. We also discuss an interesting link between Shannon-entropy minimization and standard pseudo-mask generation, and argue that the former should be preferred over the latter for leveraging information from unlabeled pixels. We evaluate the effectiveness of the proposed formulation through a series of quantitative and qualitative experiments using two publicly available datasets. Results demonstrate that our method significantly outperforms other strategies for semantic segmentation within a mixed-supervision framework, as well as recent semi-supervised approaches.
Moreover, in line with recent observations in classification, we show that the branch trained with reduced supervision and guided by the top branch largely outperforms the latter. Our code is publicly available: https://github.com/josedolz/MSL-student-becomes-master.

© 2021 Elsevier B.V. All rights reserved.
∗Corresponding author: bingyuan.Liu@etsmtl.ca
∗∗Corresponding author: jose.dolz@etsmtl.ca
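
The three terms of the student objective described in the abstract (cross-entropy over labeled pixels, Shannon entropy of the student predictions, and a KL divergence to the teacher predictions) can be sketched in NumPy as follows. This is a minimal illustration rather than the released implementation: the function names, the weights `lambda_ent` and `lambda_kl`, the flattened `(pixels, classes)` layout, and the KL direction (teacher relative to student) are all assumptions made for the example.

```python
import numpy as np

EPS = 1e-8  # small constant to keep log() finite


def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def partial_cross_entropy(probs, labels, mask):
    """Cross-entropy averaged over labeled pixels only.

    probs: (N, K) class probabilities, labels: (N,) integer classes,
    mask: (N,) boolean, True where a pixel carries a label.
    """
    picked = probs[np.arange(len(labels)), labels]
    return float(-(np.log(picked + EPS) * mask).sum() / max(mask.sum(), 1))


def shannon_entropy(probs):
    """Mean per-pixel Shannon entropy; low values mean confident predictions."""
    return float(-(probs * np.log(probs + EPS)).sum(axis=-1).mean())


def kl_divergence(p, q):
    """Mean per-pixel KL(p || q)."""
    return float((p * (np.log(p + EPS) - np.log(q + EPS))).sum(axis=-1).mean())


def student_loss(student_logits, teacher_logits, labels, mask,
                 lambda_ent=0.1, lambda_kl=1.0):
    """Hypothetical combined objective for the weakly supervised branch."""
    s = softmax(student_logits)
    t = softmax(teacher_logits)  # teacher output used as a fixed target here
    return (partial_cross_entropy(s, labels, mask)
            + lambda_ent * shannon_entropy(s)
            + lambda_kl * kl_divergence(t, s))  # direction is an assumption
```

On its own, the entropy term is minimized by any one-hot prediction, including a degenerate one that assigns every pixel to the same class; in this sketch the KL term is what ties the confident prediction to the teacher's output, which matches the role the abstract assigns to it.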
