暂无分享,去创建一个
Stephen Lin | Baining Guo | Yutong Lin | Zheng Zhang | Yue Cao | Han Hu | Ze Liu | Yixuan Wei | Stephen Lin | B. Guo | Han Hu | Yue Cao | Zheng Zhang | Yixuan Wei | Yutong Lin | Ze Liu | B. Guo
[1] Zhuowen Tu,et al. Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Nikos Komodakis,et al. Wide Residual Networks , 2016, BMVC.
[3] Matthieu Cord,et al. Training data-efficient image transformers & distillation through attention , 2020, ICML.
[4] Yi Yang,et al. Random Erasing Data Augmentation , 2017, AAAI.
[5] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[6] Cewu Lu,et al. InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[7] Cheng Chi,et al. RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder , 2020, NeurIPS.
[8] Bin Li,et al. Deformable DETR: Deformable Transformers for End-to-End Object Detection , 2020, ICLR.
[9] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[10] Yuning Jiang,et al. Unified Perceptual Parsing for Scene Understanding , 2018, ECCV.
[11] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Colin Raffel,et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res..
[13] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[14] Larry S. Davis,et al. SNIPER: Efficient Multi-Scale Training , 2018, NeurIPS.
[15] Hongyi Zhang,et al. mixup: Beyond Empirical Risk Minimization , 2017, ICLR.
[16] Kai Chen,et al. Hybrid Task Cascade for Instance Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Stephen Lin,et al. Local Relation Networks for Image Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[18] Chongruo Wu,et al. ResNeSt: Split-Attention Networks , 2020, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[19] Quoc V. Le,et al. Randaugment: Practical automated data augmentation with a reduced search space , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[20] Larry S. Davis,et al. Soft-NMS — Improving Object Detection with One Line of Code , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[21] Kaiming He,et al. Designing Network Design Spaces , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Matthieu Cord,et al. ResMLP: Feedforward Networks for Image Classification With Data-Efficient Training , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[23] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[24] Eric Tzeng,et al. Toward Transformer-Based Object Detection , 2020, ArXiv.
[25] Jun Fu,et al. Dual Attention Network for Scene Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Lucas Beyer,et al. Big Transfer (BiT): General Visual Representation Learning , 2020, ECCV.
[27] Larry S. Davis,et al. An Analysis of Scale Invariance in Object Detection - SNIP , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[28] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Zheng Zhang,et al. Disentangled Non-Local Neural Networks , 2020, ECCV.
[30] Enhua Wu,et al. Transformer in Transformer , 2021, NeurIPS.
[31] Yichen Wei,et al. Relation Networks for Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[32] Vladlen Koltun,et al. Exploring Self-Attention for Image Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Quoc V. Le,et al. EfficientDet: Scalable and Efficient Object Detection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Pieter Abbeel,et al. Bottleneck Transformers for Visual Recognition , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Quoc V. Le,et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.
[36] Zheng Zhang,et al. RepPoints V2: Verification Meets Regression for Object Detection , 2020, NeurIPS.
[37] Lukasz Kaiser,et al. Rethinking Attention with Performers , 2020, ArXiv.
[38] Alan Yuille,et al. DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution , 2020, ArXiv.
[39] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[40] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[41] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Quoc V. Le,et al. Attention Augmented Convolutional Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[43] Ross B. Girshick,et al. Mask R-CNN , 2017, 1703.06870.
[44] Jingdong Wang,et al. OCNet: Object Context Network for Scene Parsing , 2018, ArXiv.
[45] Yi Jiang,et al. Sparse R-CNN: End-to-End Object Detection with Learnable Proposals , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Liu Yang,et al. Long Range Arena: A Benchmark for Efficient Transformers , 2020, ICLR.
[47] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[48] Nicolas Usunier,et al. End-to-End Object Detection with Transformers , 2020, ECCV.
[49] Kilian Q. Weinberger,et al. Deep Networks with Stochastic Depth , 2016, ECCV.
[50] D. Hubel,et al. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex , 1962, The Journal of physiology.
[51] George Papandreou,et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.
[52] Bolei Zhou,et al. Semantic Understanding of Scenes Through the ADE20K Dataset , 2016, International Journal of Computer Vision.
[53] Alexander Kolesnikov,et al. MLP-Mixer: An all-MLP Architecture for Vision , 2021, NeurIPS.
[54] Stephen Lin,et al. GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[55] Ling Shao,et al. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions , 2021, ArXiv.
[56] Tao Xiang,et al. Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[57] Bo Zhang,et al. Do We Really Need Explicit Position Encodings for Vision Transformers? , 2021, ArXiv.
[58] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[59] Jianfeng Gao,et al. UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training , 2020, ICML.
[60] Yi Li,et al. Deformable Convolutional Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[61] Georg Heigold,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2021, ICLR.
[62] Kaiming He,et al. Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[63] Yoshua Bengio,et al. Object Recognition with Gradient-Based Learning , 1999, Shape, Contour and Grouping in Computer Vision.
[64] Quoc V. Le,et al. SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[65] Seong Joon Oh,et al. CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[66] Nuno Vasconcelos,et al. Cascade R-CNN: Delving Into High Quality Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[67] Xilin Chen,et al. Object-Contextual Representations for Semantic Segmentation , 2019, ECCV.
[68] Boris Polyak,et al. Acceleration of stochastic approximation by averaging , 1992 .
[69] Torsten Hoefler,et al. Augment Your Batch: Improving Generalization Through Instance Repetition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[70] Stephen Lin,et al. Deformable ConvNets V2: More Deformable, Better Results , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[71] Shifeng Zhang,et al. Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[72] Yichen Wei,et al. Learning Region Features for Object Detection , 2018, ECCV.
[73] Quoc V. Le,et al. Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[74] Kai Chen,et al. MMDetection: Open MMLab Detection Toolbox and Benchmark , 2019, ArXiv.
[75] Shuicheng Yan,et al. Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet , 2021, ArXiv.
[76] Kunihiko Fukushima,et al. Cognitron: A self-organizing multilayered neural network , 1975, Biological Cybernetics.
[77] Hong-Yuan Mark Liao,et al. YOLOv4: Optimal Speed and Accuracy of Object Detection , 2020, ArXiv.
[78] Yang Zhao,et al. Deep High-Resolution Representation Learning for Visual Recognition , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[79] Jun Fu,et al. Adaptive Context Network for Scene Parsing , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[80] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[81] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[82] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.