暂无分享,去创建一个
[1] Yi Yang,et al. Random Erasing Data Augmentation , 2017, AAAI.
[2] Stephen Lin,et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[3] Frank Hutter,et al. Fixing Weight Decay Regularization in Adam , 2017, ArXiv.
[4] Julien Mairal,et al. Emerging Properties in Self-Supervised Vision Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[5] Alexander Kolesnikov,et al. MLP-Mixer: An all-MLP Architecture for Vision , 2021, NeurIPS.
[6] Quoc V. Le,et al. Randaugment: Practical automated data augmentation with a reduced search space , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[7] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Ling Shao,et al. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions , 2021, ArXiv.
[9] Nenghai Yu,et al. CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Matthijs Douze,et al. XCiT: Cross-Covariance Image Transformers , 2021, NeurIPS.
[11] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[12] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[13] Chunhua Shen,et al. Twins: Revisiting the Design of Spatial Attention in Vision Transformers , 2021, NeurIPS.
[14] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[15] Kaiming He,et al. Designing Network Design Spaces , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Seong Joon Oh,et al. CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[17] Shuicheng Yan,et al. Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[18] Long Zhao,et al. Aggregating Nested Transformers , 2021, ArXiv.
[19] Xinggang Wang,et al. MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens , 2021, ArXiv.
[20] Matthieu Cord,et al. Going deeper with Image Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[21] Georg Heigold,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2021, ICLR.
[22] Josef Kittler,et al. SiT: Self-supervised vIsion Transformer , 2021, ArXiv.
[23] Lukasz Kaiser,et al. Reformer: The Efficient Transformer , 2020, ICLR.
[24] Shuang Xu,et al. Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Yang Song,et al. The iNaturalist Species Classification and Detection Dataset , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[26] Quanfu Fan,et al. RegionViT: Regional-to-Local Attention for Vision Transformers , 2021, ArXiv.
[27] Shuicheng Yan,et al. Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet , 2021, ArXiv.
[28] Yunfeng Cai,et al. S2-MLP: Spatial-Shift MLP Architecture for Vision , 2021, 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
[29] Ding Liang,et al. CycleMLP: A MLP-like Architecture for Dense Prediction , 2021, ArXiv.
[30] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[31] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[32] Jianfeng Gao,et al. Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks , 2020, ECCV.
[33] Matthieu Cord,et al. Training data-efficient image transformers & distillation through attention , 2020, ICML.
[34] Hongyi Zhang,et al. mixup: Beyond Empirical Risk Minimization , 2017, ICLR.
[35] Matthieu Cord,et al. ResMLP: Feedforward Networks for Image Classification With Data-Efficient Training , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[36] Kai Han,et al. Hire-MLP: Vision MLP via Hierarchical Rearrangement , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Enhua Wu,et al. Transformer in Transformer , 2021, NeurIPS.
[38] Yunfeng Cai,et al. Rethinking Token-Mixing MLP for MLP-based Vision Backbone , 2021, BMVC.
[39] Quoc V. Le,et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.