Self-Guided and Cross-Guided Learning for Few-Shot Segmentation

Few-shot segmentation has attracted considerable attention because it can segment unseen object classes from only a few annotated samples. Most existing approaches use masked Global Average Pooling (GAP) to encode an annotated support image into a feature vector that guides query image segmentation. However, this pipeline unavoidably loses some discriminative information because of the averaging operation. In this paper, we propose a simple but effective self-guided learning approach that mines the lost critical information. Specifically, an initial prediction is first made for the annotated support image; the foreground regions that this prediction covers and those it misses are then encoded with masked GAP into the primary and auxiliary support vectors, respectively. Aggregating the primary and auxiliary support vectors yields better segmentation performance on query images. Inspired by our self-guided module for 1-shot segmentation, we propose a cross-guided module for multi-shot segmentation, in which the final mask is fused from the predictions of multiple annotated samples, with high-quality support vectors contributing more and low-quality ones contributing less. This module improves the final prediction at inference time without re-training. Extensive experiments show that our approach achieves new state-of-the-art performance on both the PASCAL-5i and COCO-20i datasets. Source code is available at https://github.com/zbf1991/SCL.
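
The split into primary and auxiliary support vectors can be illustrated with a minimal NumPy sketch. It is not the released implementation: the function names `masked_gap` and `primary_auxiliary_vectors`, the `(C, H, W)` feature layout, and the small `eps` stabiliser are assumptions made for illustration, and the initial prediction `pred_mask` is taken as given rather than produced by a network.

```python
import numpy as np


def masked_gap(features, mask, eps=1e-6):
    """Masked global average pooling.

    features: array of shape (C, H, W), a support feature map (assumed layout).
    mask:     array of shape (H, W) with values in {0, 1}.
    Returns a (C,) vector: the per-channel mean over the masked positions.
    """
    mask = mask.astype(features.dtype)
    # Broadcasting applies the (H, W) mask to every channel.
    return (features * mask).sum(axis=(1, 2)) / (mask.sum() + eps)


def primary_auxiliary_vectors(features, gt_mask, pred_mask):
    """Encode covered and uncovered foreground into two support vectors.

    gt_mask:   ground-truth foreground of the support image, shape (H, W).
    pred_mask: initial foreground prediction for the same image, shape (H, W).
    """
    covered = gt_mask * pred_mask          # foreground the prediction found
    uncovered = gt_mask * (1 - pred_mask)  # foreground the prediction missed
    return masked_gap(features, covered), masked_gap(features, uncovered)


# Toy example: 2 channels on a 2x2 grid.
features = np.array([[[1.0, 2.0], [3.0, 4.0]],
                     [[10.0, 20.0], [30.0, 40.0]]])
gt_mask = np.array([[1, 1], [1, 0]])
pred_mask = np.array([[1, 1], [0, 1]])

primary, auxiliary = primary_auxiliary_vectors(features, gt_mask, pred_mask)
print(primary, auxiliary)
```

In this toy case the primary vector averages the two foreground pixels that the initial prediction recovered, while the auxiliary vector captures the single foreground pixel it missed, which is the information a single masked GAP over the whole foreground would blur together. How the two vectors are aggregated is left out here because the abstract does not specify it.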
