An Alternative to WSSS? An Empirical Study of the Segment Anything Model (SAM) on Weakly-Supervised Semantic Segmentation Problems

The Segment Anything Model (SAM) has demonstrated exceptional performance and versatility, making it a promising tool for various related tasks. In this report, we explore the application of SAM to Weakly-Supervised Semantic Segmentation (WSSS). Specifically, we adapt SAM as the pseudo-label generation pipeline given only image-level class labels. While we observe impressive results in most cases, we also identify certain limitations. Our study includes performance evaluations on PASCAL VOC and MS-COCO, where we achieve remarkable improvements over the latest state-of-the-art methods on both datasets. We hope that this report encourages further exploration of adopting SAM in WSSS, as well as wider real-world applications.
