DETRs with Hybrid Matching
Xiao-pei Wu | Chao Zhang | Yuhui Yuan | Haojun Yu | Hanhua Hu | Ding Jia | Weihong Lin | Lei-huan Sun | Hao He
[1] A. Yuille,et al. MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models , 2022, ICLR.
[2] Maxwell D. Collins,et al. k-means Mask Transformer , 2022, ECCV.
[3] Bailan Feng,et al. CF-DETR: Coarse-to-Fine Transformers for End-to-End Object Detection , 2022, AAAI Conference on Artificial Intelligence.
[4] Xiangyu Zhang,et al. Anchor DETR: Query Design for Transformer-Based Detector , 2022, AAAI.
[5] H. Shum,et al. Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Jian Sun,et al. PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images , 2022, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[7] Dahu Shi,et al. End-to-End Multi-Person Pose Estimation with Transformers , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] B. Uzkent,et al. Lite-MDETR: A Lightweight Multi-Modal Detector , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Maxwell D. Collins,et al. CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Daniel Y. Fu,et al. FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness , 2022, NeurIPS.
[11] Kaicheng Yu,et al. BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework , 2022, NeurIPS.
[12] Huizi Mao,et al. BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation , 2022, 2023 IEEE International Conference on Robotics and Automation (ICRA).
[13] Z. Tu,et al. Text Spotting Transformers , 2022, Computer Vision and Pattern Recognition.
[14] Junjun Jiang,et al. BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation , 2022, ArXiv.
[15] Jifeng Dai,et al. BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers , 2022, ECCV.
[16] Junjie Huang,et al. BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection , 2022, ArXiv.
[17] Ross B. Girshick,et al. Exploring Plain Vision Transformer Backbones for Object Detection , 2022, ECCV.
[18] Limin Wang,et al. AdaMixer: A Fast-Converging Query-Based Object Detector , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] A. Bissacco,et al. Towards End-to-End Unified Scene Text Detection and Layout Analysis , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Xianming Liu,et al. DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation , 2022, Machine Intelligence Research.
[21] Chiew-Lan Tai,et al. TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Jing Zhang,et al. Towards Data-Efficient Detection Transformers , 2022, ECCV.
[23] Shijian Lu,et al. Accelerating DETR Convergence via Semantic-Aligned Matching , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Jian Sun,et al. PETR: Position Embedding Transformation for Multi-View 3D Object Detection , 2022, ECCV.
[25] H. Shum,et al. DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection , 2022, ICLR.
[26] L. Ni,et al. DN-DETR: Accelerate DETR Training by Introducing Query DeNoising , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] J. Zelek,et al. Arbitrary Shape Text Detection using Transformers , 2022, 2022 26th International Conference on Pattern Recognition (ICPR).
[28] Hang Su,et al. DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR , 2022, ICLR.
[29] Trevor Darrell,et al. A ConvNet for the 2020s , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Armand Joulin,et al. Detecting Twenty-thousand Classes using Image-level Supervision , 2022, ECCV.
[31] Jiannan Wu,et al. Language as Queries for Referring Video Object Segmentation , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Wenming Tan,et al. SOIT: Segmenting Objects with Instance-Aware Transformers , 2021, AAAI.
[33] Philip H. S. Torr,et al. LAVT: Language-Aware Vision Transformer for Referring Image Segmentation , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] A. Schwing,et al. Masked-attention Mask Transformer for Universal Image Segmentation , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[35] J. Malik,et al. MViTv2: Improved Multiscale Vision Transformers for Classification and Detection , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Saehoon Kim,et al. Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity , 2021, ICLR.
[37] Li Dong,et al. Swin Transformer V2: Scaling Up Capacity and Resolution , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Anima Anandkumar,et al. Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Haibin Ling,et al. CBNet: A Composite Backbone Network Architecture for Object Detection , 2021, IEEE Transactions on Image Processing.
[40] X. Zhang,et al. MOTR: End-to-End Multiple-Object Tracking with TRansformer , 2021, ECCV.
[41] L. Leal-Taixé,et al. TrackFormer: Multi-Object Tracking with Transformers , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Sinan Kalkan,et al. One Metric to Measure Them All: Localisation Recall Precision (LRP) for Evaluating Visual Detection Tasks , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[43] Gang Zeng,et al. Group DETR: Fast Training Convergence with Decoupled One-to-Many Label Assignment , 2022, ArXiv.
[44] Dalong Du,et al. BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View , 2021, ArXiv.
[45] Alexander G. Schwing,et al. Mask2Former for Video Instance Segmentation , 2021, ArXiv.
[46] Shuicheng Yan,et al. Direct Multi-view Multi-person 3D Pose Estimation , 2021, NeurIPS.
[47] Yilun Wang,et al. DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries , 2021, CoRL.
[48] Lu Yuan,et al. Dynamic DETR: End-to-End Object Detection with Dynamic Attention , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[49] Nikita Kister,et al. The Center of Attention: Center-Keypoint Grouping via Attention for Multi-Person Pose Estimation , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[50] Rohit Girdhar,et al. An End-to-End Transformer Model for 3D Object Detection , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[51] Depu Meng,et al. Conditional DETR for Fast Training Convergence , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[52] Yichen Wei,et al. SOLQ: Segmenting Objects by Learning Queries , 2021, NeurIPS.
[53] John S. Zelek,et al. Transformer-based Text Detection in the Wild , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[54] Masayuki Inaba,et al. TrTr: Visual Tracking with Transformer , 2021, ArXiv.
[55] Xinggang Wang,et al. Instances as Queries , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[56] Yann LeCun,et al. MDETR - Modulated Detection for End-to-End Multi-Modal Understanding , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[57] Zhuowen Tu,et al. Pose Recognition with Cascade Transformers , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[58] Shu-Tao Xia,et al. TokenPose: Learning Keypoint Tokens for Human Pose Estimation , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[59] Boxun Li,et al. Efficient DETR: Improving End-to-End Object Detector with Dense Prior , 2021, ArXiv.
[60] Zheng Zhang,et al. Group-Free 3D Object Detection via Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[61] Jianlong Fu,et al. Learning Spatio-Temporal Transformer for Visual Tracking , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[62] Huchuan Lu,et al. Transformer Tracking , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[63] Chunhua Shen,et al. TFPose: Direct Human Pose Estimation with Transformers , 2021, ArXiv.
[64] Alexander Mathis,et al. End-to-End Trainable Multi-Instance Pose Estimation with Transformers , 2021, ArXiv.
[65] Jason J. Corso,et al. Depth from Camera Motion and Object Detection , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[66] Peng Gao,et al. Fast Convergence of DETR with Spatially Modulated Co-Attention , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[67] Z. Tu,et al. Line Segment Detection Using Transformers without Edges , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[68] Quoc V. Le,et al. Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[69] Nanning Zheng,et al. End-to-End Object Detection with Fully Convolutional Network , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[70] A. Yuille,et al. MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[71] Bin Li,et al. Deformable DETR: Deformable Transformers for End-to-End Object Detection , 2020, ICLR.
[72] Song Bai,et al. SeqFormer: a Frustratingly Simple Model for Video Instance Segmentation , 2021, ArXiv.
[73] Stephen Lin,et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[74] P. Luo,et al. TransTrack: Multiple-Object Tracking with Transformer , 2020, ArXiv.
[75] Jian Sun,et al. AutoAssign: Differentiable Label Assignment for Dense Object Detection , 2020, ArXiv.
[76] Nicolas Usunier,et al. End-to-End Object Detection with Transformers , 2020, ECCV.
[77] Shifeng Zhang,et al. Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[78] Jongyoul Park,et al. CenterMask: Real-Time Anchor-Free Instance Segmentation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[79] Xilin Chen,et al. Object-Contextual Representations for Semantic Segmentation , 2019, ECCV.
[80] Qiang Xu,et al. nuScenes: A Multimodal Dataset for Autonomous Driving , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[81] Ross B. Girshick,et al. Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[82] Ross B. Girshick,et al. LVIS: A Dataset for Large Vocabulary Instance Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[83] Hao Chen,et al. FCOS: Fully Convolutional One-Stage Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[84] Kaiming He,et al. Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[85] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[86] Matthias Nießner,et al. ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[87] Stefan Roth,et al. MOT16: A Benchmark for Multi-Object Tracking , 2016, ArXiv.
[88] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[89] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[90] Ross B. Girshick,et al. Fast R-CNN , 2015, arXiv:1504.08083.
[91] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[94] Luc Van Gool,et al. Efficient Non-Maximum Suppression , 2006, 18th International Conference on Pattern Recognition (ICPR'06).