Generalized Few-Shot Semantic Segmentation: All You Need is Fine-Tuning

Generalized few-shot semantic segmentation was introduced to move beyond evaluating few-shot segmentation models only on novel classes and to also test their ability to remember base classes. While all current approaches are based on meta-learning, they perform poorly and saturate in learning after observing only a few shots. We propose the first fine-tuning solution and demonstrate that it addresses the saturation problem while achieving state-of-the-art results on two datasets, PASCAL-5i and COCO-20i. We also show that it outperforms existing methods whether fine-tuning multiple final layers or only the final layer. Finally, we present a triplet loss regularization that shows how to redistribute the balance of performance between novel and base categories so that there is a smaller gap between them.
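
As a rough illustration of the triplet loss regularization mentioned above, the following is a minimal sketch of the standard margin-based triplet loss added to a segmentation loss. The abstract does not specify how triplets are formed or how the term is weighted, so the function names, the Euclidean distance, the `margin`, and the `weight` parameter are all assumptions made for illustration, not the authors' implementation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def regularized_loss(seg_loss, anchor, positive, negative, weight=0.1, margin=1.0):
    """Hypothetical combined objective: segmentation loss plus a weighted triplet term.

    `weight` is an assumed hyperparameter controlling the regularization strength.
    """
    return seg_loss + weight * triplet_loss(anchor, positive, negative, margin)


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    same_class = [0.0, 1.0]    # feature from the same class as the anchor
    other_class = [3.0, 4.0]   # feature from a different class
    print(triplet_loss(anchor, same_class, other_class))
    print(regularized_loss(2.0, anchor, other_class, same_class))
```

In this sketch the anchor and positive would be features of the same class and the negative a feature of a different class; pulling same-class features together and pushing different-class features apart is one plausible way such a term could shift the balance between novel and base categories.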
