OMG-Seg: Is One Model Good Enough For All Segmentation?
Wenwei Zhang | Haobo Yuan | Size Wu | Chen Change Loy | Xiangtai Li | Yining Li | Kai Chen | Wei Li | Henghui Ding
[1] Haobo Yuan,et al. RAP-SAM: Towards Real-Time All-Purpose Segment Anything , 2024, ArXiv.
[2] Chong Zhou,et al. Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively , 2024, ArXiv.
[3] Xinshun Wang,et al. Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning , 2023, ArXiv.
[4] Xinyang Geng,et al. Sequential Modeling Enables Scalable Learning for Large Vision Models , 2023, ArXiv.
[5] Hao Zhou,et al. Rethinking Evaluation Metrics of Open-Vocabulary Segmentaion , 2023, ArXiv.
[6] Wenwei Zhang,et al. DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection , 2023, ArXiv.
[7] Wenwei Zhang,et al. CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction , 2023, ArXiv.
[8] Chen Change Loy,et al. MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[9] Liang-Chieh Chen,et al. Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP , 2023, NeurIPS.
[10] Pei Sun,et al. Semantic-SAM: Segment and Recognize Anything at Any Granularity , 2023, ArXiv.
[11] Trevor Darrell,et al. Hierarchical Open-vocabulary Universal Image Segmentation , 2023, NeurIPS.
[12] Bernard Ghanem,et al. Towards Open Vocabulary Learning: A Survey , 2023, IEEE transactions on pattern analysis and machine intelligence.
[13] Li Dong,et al. Kosmos-2: Grounding Multimodal Large Language Models to the World , 2023, ArXiv.
[14] Chen Change Loy,et al. Explore In-Context Learning for 3D Point Cloud Understanding , 2023, NeurIPS.
[15] Jiannan Wu,et al. VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks , 2023, NeurIPS.
[16] Chen Change Loy,et al. Transformer-Based Visual Segmentation: A Survey , 2023, IEEE transactions on pattern analysis and machine intelligence.
[17] Yong Jae Lee,et al. Segment Everything Everywhere All at Once , 2023, NeurIPS.
[18] Chunhua Shen,et al. SegGPT: Segmenting Everything In Context , 2023, ArXiv.
[19] Ross B. Girshick,et al. Segment Anything , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[20] Rui Wang,et al. FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation , 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Chen Change Loy,et al. Correlational Image Modeling for Self-Supervised Visual Pre-Training , 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] A. Torralba,et al. Detecting Everything in the Open World: Towards Universal Object Detection , 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] A. Torralba,et al. Open-vocabulary Panoptic Segmentation with Embedding Modulation , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[24] Jianfeng Gao,et al. A Simple Framework for Open-Vocabulary Segmentation and Detection , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[25] Jiannan Wu,et al. Universal Instance Perception as Object Discovery and Retrieval , 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Shalini De Mello,et al. Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models , 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Philip H. S. Torr,et al. MOSE: A New Dataset for Video Object Segmentation in Complex Scenes , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[28] Chen Change Loy,et al. Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[29] Yong Jae Lee,et al. Generalized Decoding for Pixel, Image, and Language , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Chunhua Shen,et al. Images Speak in Images: A Generalist Painter for In-Context Visual Learning , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Ledell Yu Wu,et al. EVA: Exploring the Limits of Masked Visual Representation Learning at Scale , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Humphrey Shi,et al. OneFormer: One Transformer to Rule Universal Image Segmentation , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Chang Liu,et al. VLT: Vision-Language Transformer and Query Generation for Referring Segmentation , 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[34] A. Piergiovanni,et al. F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models , 2022, ArXiv.
[35] Alexei A. Efros,et al. Visual Prompting via Image Inpainting , 2022, NeurIPS.
[36] Anima Anandkumar,et al. MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training , 2022, NeurIPS.
[37] A. Yuille,et al. In Defense of Online Models for Video Instance Segmentation , 2022, ECCV.
[38] Aniruddha Kembhavi,et al. Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks , 2022, ICLR.
[39] Chen Change Loy,et al. Masked Frequency Modeling for Self-Supervised Visual Pre-Training , 2022, ICLR.
[40] David J. Fleet,et al. A Unified Sequence Interface for Vision Tasks , 2022, NeurIPS.
[41] Yunchao Wei,et al. Large-scale Video Panoptic Segmentation in the Wild: A Benchmark , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Liang-Chieh Chen,et al. TubeFormer-DeepLab: Video Mask Transformer , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Jifeng Dai,et al. Vision Transformer Adapter for Dense Predictions , 2022, ICLR.
[44] Oriol Vinyals,et al. Flamingo: a Visual Language Model for Few-Shot Learning , 2022, NeurIPS.
[45] Chen Change Loy,et al. Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] D. Tao,et al. Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation , 2022, ECCV.
[47] Chen Change Loy,et al. Open-Vocabulary DETR with Conditional Matching , 2022, ECCV.
[48] S. Hoi,et al. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation , 2022, ICML.
[49] Trevor Darrell,et al. A ConvNet for the 2020s , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[50] Kilian Q. Weinberger,et al. Language-driven Semantic Segmentation , 2022, ICLR.
[51] Armand Joulin,et al. Detecting Twenty-thousand Classes using Image-level Supervision , 2022, ECCV.
[52] James Hays,et al. MSeg: A Composite Dataset for Multi-Domain Semantic Segmentation , 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[53] Yin Cui,et al. Scaling Open-Vocabulary Image Segmentation with Image-Level Labels , 2021, ECCV.
[54] Alexander G. Schwing,et al. Mask2Former for Video Instance Segmentation , 2021, ArXiv.
[55] S. Bai,et al. SeqFormer: Sequential Transformer for Video Instance Segmentation , 2021, ECCV.
[56] A. Schwing,et al. Masked-attention Mask Transformer for Universal Image Segmentation , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[57] Ross B. Girshick,et al. Masked Autoencoders Are Scalable Vision Learners , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[58] David J. Fleet,et al. Pix2seq: A Language Modeling Framework for Object Detection , 2021, ICLR.
[59] Alexander G. Schwing,et al. Per-Pixel Classification is Not All You Need for Semantic Segmentation , 2021, NeurIPS.
[60] David J. Crandall,et al. A Survey on Deep Learning Technique for Video Segmentation , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[61] Kai Chen,et al. K-Net: Towards Unified Image Segmentation , 2021, NeurIPS.
[62] Jiaxu Miao,et al. VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[63] Yin Cui,et al. Open-vocabulary Object Detection via Vision and Language Knowledge Distillation , 2021, ICLR.
[64] Du Tran,et al. Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[65] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[66] A. Yuille,et al. MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[67] Chunhua Shen,et al. End-to-End Video Instance Segmentation with Transformers , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[68] Shih-Fu Chang,et al. Open-Vocabulary Object Detection Using Captions , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[69] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[70] Bin Li,et al. Deformable DETR: Deformable Transformers for End-to-End Object Detection , 2020, ICLR.
[71] Jianping Shi,et al. Improving Semantic Segmentation via Decoupled Body and Edge Supervision , 2020, ECCV.
[72] A. Yuille,et al. DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[73] In So Kweon,et al. Video Panoptic Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[74] Nicolas Usunier,et al. End-to-End Object Detection with Transformers , 2020, ECCV.
[75] Kuiyuan Yang,et al. Semantic Flow for Fast and Accurate Scene Parsing , 2020, ECCV.
[76] Antonio J. Plaza,et al. Image Segmentation Using Deep Learning: A Survey , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[77] Yuning Jiang,et al. SOLO: Segmenting Objects by Locations , 2019, ECCV.
[78] Jian Sun,et al. Objects365: A Large-Scale, High-Quality Dataset for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[79] Kai Chen,et al. MMDetection: Open MMLab Detection Toolbox and Benchmark , 2019, ArXiv.
[80] Yuchen Fan,et al. Video Instance Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[81] Ning Xu,et al. Video Object Segmentation Using Space-Time Memory Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[82] Kai Chen,et al. Hybrid Task Cascade for Instance Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[83] Kaiming He,et al. Panoptic Feature Pyramid Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[84] Luc Van Gool,et al. The 2018 DAVIS Challenge on Video Object Segmentation , 2018, ArXiv.
[85] George Papandreou,et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.
[86] Carsten Rother,et al. Panoptic Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[87] George Papandreou,et al. Rethinking Atrous Convolution for Semantic Image Segmentation , 2017, ArXiv.
[88] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[89] Ross B. Girshick,et al. Mask R-CNN , 2017, arXiv:1703.06870.
[90] Bolei Zhou,et al. Semantic Understanding of Scenes Through the ADE20K Dataset , 2016, International Journal of Computer Vision.
[91] Seyed-Ahmad Ahmadi,et al. V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation , 2016, 2016 Fourth International Conference on 3D Vision (3DV).
[92] Alan L. Yuille,et al. Generation and Comprehension of Unambiguous Object Descriptions , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[93] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[94] Maxwell D. Collins,et al. k-means Mask Transformer , 2022, ECCV.
[95] Antonio Criminisi,et al. Object Class Segmentation using Random Forests , 2008, BMVC.
[96] Thomas K. Leung,et al. Contour and Texture Analysis for Image Segmentation , 2001, International Journal of Computer Vision.
[97] Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation Supplementary , 2022.