Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision

In this paper, we study the semi-supervised semantic segmentation problem via exploring both labeled data and extra unlabeled data. We propose a novel consistency regularization approach, called cross pseudo supervision (CPS). Our approach imposes the consistency on two segmentation networks perturbed with different initialization for the same input image. The pseudo one-hot label map, output from one perturbed segmentation network, is used to supervise the other segmentation network with the standard cross-entropy loss, and vice versa. The CPS consistency has two roles: encourage high similarity between the predictions of two perturbed networks for the same input image, and expand training data by using the unlabeled data with pseudo labels. Experiment results show that our approach achieves the state-of-the-art semi-supervised segmentation performance on Cityscapes and PASCAL VOC 2012.

[1]  Luc Van Gool,et al.  The Pascal Visual Object Classes Challenge: A Retrospective , 2014, International Journal of Computer Vision.

[2]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[3]  Dong-Hyun Lee,et al.  Pseudo-Label : The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks , 2013 .

[4]  Shin Ishii,et al.  Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Jianping Shi,et al.  Semi-Supervised Semantic Segmentation via Dynamic Self-Training and Class-Balanced Curriculum , 2020, ArXiv.

[6]  Timo Aila,et al.  Semi-supervised semantic segmentation needs strong, high-dimensional perturbations , 2019 .

[7]  Harri Valpola,et al.  Weight-averaged consistency targets improve semi-supervised deep learning results , 2017, ArXiv.

[8]  Dong Liu,et al.  Deep High-Resolution Representation Learning for Human Pose Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Subhransu Maji,et al.  Semantic contours from inverse detectors , 2011, 2011 International Conference on Computer Vision.

[10]  Sanja Fidler,et al.  Gated-SCNN: Gated Shape CNNs for Semantic Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[11]  David Berthelot,et al.  MixMatch: A Holistic Approach to Semi-Supervised Learning , 2019, NeurIPS.

[12]  Xiaogang Wang,et al.  Pyramid Scene Parsing Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Vladlen Koltun,et al.  Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[14]  Chongruo Wu,et al.  Improving Semantic Segmentation via Self-Training , 2020, ArXiv.

[15]  Ming-Hsuan Yang,et al.  Adversarial Learning for Semi-supervised Semantic Segmentation , 2018, BMVC.

[16]  Quoc V. Le,et al.  Self-Training With Noisy Student Improves ImageNet Classification , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Yang Zhao,et al.  Deep High-Resolution Representation Learning for Visual Recognition , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  H. J. Scudder,et al.  Probability of error of some adaptive pattern-recognition machines , 1965, IEEE Trans. Inf. Theory.

[19]  Di Qiu,et al.  Guided Collaborative Training for Pixel-wise Semi-Supervised Learning , 2020, ECCV.

[20]  João Paulo Papa,et al.  Semi-supervised Segmentation Based on Error-Correcting Supervision , 2020, ECCV.

[21]  Concetto Spampinato,et al.  Semi Supervised Semantic Segmentation Using Generative Adversarial Network , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[22]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  David Berthelot,et al.  FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence , 2020, NeurIPS.

[24]  Stanley C. Fralick,et al.  Learning to recognize patterns without a teacher , 1967, IEEE Trans. Inf. Theory.

[25]  Xilin Chen,et al.  Object-Contextual Representations for Semantic Segmentation , 2019, ECCV.

[26]  Maxwell D. Collins,et al.  Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation , 2020, ArXiv.

[27]  Jingdong Wang,et al.  OCNet: Object Context Network for Scene Parsing , 2018, ArXiv.

[28]  Chun-Liang Li,et al.  PseudoSeg: Designing Pseudo Labels for Semantic Segmentation , 2020, ArXiv.

[29]  Rynson W. H. Lau,et al.  Dual Student: Breaking the Limits of the Teacher in Semi-Supervised Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[30]  Xilin Chen,et al.  SegFix: Model-Agnostic Boundary Refinement for Segmentation , 2020, ECCV.

[31]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Quoc V. Le,et al.  Rethinking Pre-training and Self-training , 2020, NeurIPS.

[33]  Xiaojin Zhu,et al.  Semi-Supervised Learning , 2010, Encyclopedia of Machine Learning.

[34]  Timo Aila,et al.  Temporal Ensembling for Semi-Supervised Learning , 2016, ICLR.

[35]  ASHOK K. AGRAWALA,et al.  Learning with a probabilistic teacher , 1970, IEEE Trans. Inf. Theory.

[36]  George Papandreou,et al.  Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.

[37]  Kaiming He,et al.  PointRend: Image Segmentation As Rendering , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Thomas Brox,et al.  Semi-Supervised Semantic Segmentation With High- and Low-Level Consistency , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[40]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Simon Fraser,et al.  Semi-Supervised Semantic Image Segmentation With Self-Correcting Networks , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Jooyoung Jang,et al.  Structured Consistency Loss for semi-supervised semantic segmentation , 2020, ArXiv.

[43]  Seong Joon Oh,et al.  CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[44]  C. Hudelot,et al.  Semi-Supervised Semantic Segmentation With Cross-Consistency Training , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).