Open-Vocabulary DETR with Conditional Matching
暂无分享,去创建一个
[1] Kaiyang Zhou,et al. Neural Prompt Search , 2022, ArXiv.
[2] Miaojing Shi,et al. Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Chen Change Loy,et al. Conditional Prompt Learning for Vision-Language Models , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Armand Joulin,et al. Detecting Twenty-thousand Classes using Image-level Supervision , 2022, ECCV.
[5] Chen Change Loy,et al. Learning to Prompt for Vision-Language Models , 2021, International Journal of Computer Vision.
[6] Dahun Kim,et al. Learning Open-World Object Proposals without Learning to Classify , 2021, IEEE Robotics and Automation Letters.
[7] Yin Cui,et al. Open-vocabulary Object Detection via Vision and Language Knowledge Distillation , 2021, ICLR.
[8] Depu Meng,et al. Conditional DETR for Fast Training Convergence , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[9] Yann LeCun,et al. MDETR - Modulated Detection for End-to-End Multi-Modal Understanding , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[10] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[11] Peng Gao,et al. Fast Convergence of DETR with Spatially Modulated Co-Attention , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[12] Shih-Fu Chang,et al. Open-Vocabulary Object Detection Using Captions , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Zhigang Dai,et al. UP-DETR: Unsupervised Pre-training for Object Detection with Transformers , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] L. Sigal,et al. An Improved Attention for Visual Question Answering , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[15] Bin Li,et al. Deformable DETR: Deformable Transformers for End-to-End Object Detection , 2020, ICLR.
[16] Junnan Li,et al. Towards Open Vocabulary Object Detection without Human-provided Bounding Boxes , 2021, ArXiv.
[17] Shuai Zheng,et al. ZSD-YOLO: Zero-Shot YOLO Detection using Vision-Language KnowledgeDistillation , 2021, ArXiv.
[18] Han Fang,et al. Linformer: Self-Attention with Linear Complexity , 2020, ArXiv.
[19] Nicolas Usunier,et al. End-to-End Object Detection with Transformers , 2020, ECCV.
[20] Liu Yang,et al. Sparse Sinkhorn Attention , 2020, ICML.
[21] Changxin Gao,et al. GTNet: Generative Transfer Network for Zero-Shot Object Detection , 2020, AAAI.
[22] Venkatesh Saligrama,et al. Don’t Even Look Once: Synthesizing Features for Zero-Shot Detection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Ross B. Girshick,et al. Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[24] Ross B. Girshick,et al. Mask R-CNN , 2017, 1703.06870.
[25] Lina Yao,et al. Zero-Shot Object Detection with Textual Descriptions , 2019, AAAI.
[26] Ross B. Girshick,et al. LVIS: A Dataset for Large Vocabulary Instance Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Silvio Savarese,et al. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[29] Qi Wu,et al. Visual Grounding via Accumulated Attention , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[30] Rama Chellappa,et al. Zero-Shot Object Detection , 2018, ECCV.
[31] Ramakant Nevatia,et al. Query-Guided Regression Network with Context Policy for Phrase Grounding , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[32] Xiaogang Wang,et al. Person Search with Natural Language Description , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[34] Trevor Darrell,et al. Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[35] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[36] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[37] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[38] Marc'Aurelio Ranzato,et al. DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.
[39] Dumitru Erhan,et al. Deep Neural Networks for Object Detection , 2013, NIPS.
[40] Harold W. Kuhn,et al. The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.
[41] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.
[42] David A. McAllester,et al. A discriminatively trained, multiscale, deformable part model , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[43] Tomaso A. Poggio,et al. A Trainable System for Object Detection , 2000, International Journal of Computer Vision.