Can SAM Count Anything? An Empirical Study on SAM Counting

Meta AI recently released the Segment Anything Model (SAM), which has garnered attention for its impressive performance in class-agnostic segmentation. In this study, we explore the use of SAM for the challenging task of few-shot object counting, where objects of an unseen category must be counted given only a few exemplar bounding boxes. We compare SAM against dedicated few-shot counting methods and find that, without further fine-tuning, its performance is unsatisfactory, particularly for small and crowded objects. Code is available at https://github.com/Vision-Intelligence-and-Robots-Group/count-anything.
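To make the setup concrete, below is a minimal, hypothetical sketch of how SAM could be applied to exemplar-based counting: pool SAM's image embedding over a mask for each exemplar box, segment "everything" in the image, and count candidate masks whose pooled feature is similar to an exemplar. This is only an illustration under stated assumptions, not the authors' released pipeline; the checkpoint filename, the similarity threshold, and the helper names (mask_feature, count_with_sam) are assumptions for the sketch, and only the public segment-anything API is used.

```python
"""Hypothetical sketch: exemplar-matching counting on top of SAM.

Assumes the `segment-anything` package is installed and a ViT-H checkpoint
(`sam_vit_h_4b8939.pth`) is available locally. Runs on CPU by default.
"""
import numpy as np
import torch
import torch.nn.functional as F
from segment_anything import SamAutomaticMaskGenerator, SamPredictor, sam_model_registry


def mask_feature(embedding: torch.Tensor, mask: np.ndarray, valid_hw: tuple) -> torch.Tensor:
    """Average-pool the SAM image embedding (1, 256, 64, 64) over one binary mask."""
    vh, vw = valid_hw  # embedding cells covering the unpadded image region
    emb = embedding[0, :, :vh, :vw]                      # (256, vh, vw)
    m = torch.from_numpy(mask).float()[None, None]       # (1, 1, H, W)
    m = F.interpolate(m, size=(vh, vw), mode="bilinear")[0, 0]
    feat = (emb * m).sum(dim=(1, 2)) / m.sum().clamp(min=1e-6)
    return F.normalize(feat, dim=0)


def count_with_sam(image: np.ndarray, exemplar_boxes: np.ndarray, sim_thresh: float = 0.75) -> int:
    """Count objects similar to the exemplar boxes (XYXY) in an RGB uint8 image."""
    sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
    predictor = SamPredictor(sam)
    generator = SamAutomaticMaskGenerator(sam)

    predictor.set_image(image)
    embedding = predictor.get_image_embedding()          # (1, 256, 64, 64)
    h, w = image.shape[:2]
    # SAM resizes the longest side to 1024 and pads; stride 16 -> 64 cells on that side.
    scale = 64.0 / max(h, w)
    valid_hw = (max(1, int(round(h * scale))), max(1, int(round(w * scale))))

    # Reference features: one SAM mask per exemplar box.
    refs = []
    for box in exemplar_boxes:
        masks, _, _ = predictor.predict(box=np.asarray(box), multimask_output=False)
        refs.append(mask_feature(embedding, masks[0], valid_hw))
    refs = torch.stack(refs)                             # (num_exemplars, 256)

    # Segment everything, keep candidates whose feature matches any exemplar.
    count = 0
    for cand in generator.generate(image):
        f = mask_feature(embedding, cand["segmentation"], valid_hw)
        if (refs @ f).max() >= sim_thresh:
            count += 1
    return count
```

The similarity threshold (0.75 here) is an arbitrary illustrative value; in practice it would need tuning per dataset, and the simple cosine-matching heuristic is one plausible design choice rather than the method evaluated in the paper.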
