Yunfeng Cai | Tan Yu | Mingming Sun | Ping Li | Xu Li
[1] Gregory Shakhnarovich,et al. FractalNet: Ultra-Deep Neural Networks without Residuals , 2016, ICLR.
[2] Stephen Lin,et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[3] Enhua Wu,et al. Transformer in Transformer , 2021, NeurIPS.
[4] Torsten Hoefler,et al. Augment Your Batch: Improving Generalization Through Instance Repetition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Quoc V. Le,et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.
[6] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[7] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[8] Yunfeng Cai,et al. S2-MLP: Spatial-Shift MLP Architecture for Vision , 2021, 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
[9] Matthieu Cord,et al. Training data-efficient image transformers & distillation through attention , 2020, ICML.
[10] Long Zhao,et al. Aggregating Nested Transformers , 2021, ArXiv.
[11] Kevin Gimpel,et al. Gaussian Error Linear Units (GELUs) , 2016.
[12] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[13] Zilong Huang,et al. Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer , 2021, ArXiv.
[14] Quoc V. Le,et al. Pay Attention to MLPs , 2021, NeurIPS.
[15] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[16] Chunhua Shen,et al. Twins: Revisiting the Design of Spatial Attention in Vision Transformers , 2021, NeurIPS.
[17] Chen Sun,et al. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[18] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[19] Luke Melas-Kyriazi,et al. Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet , 2021, ArXiv.
[20] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Lukasz Kaiser,et al. Depthwise Separable Convolutions for Neural Machine Translation , 2017, ICLR.
[22] Alexander Kolesnikov,et al. MLP-Mixer: An all-MLP Architecture for Vision , 2021, NeurIPS.
[23] Ling Shao,et al. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions , 2021, ArXiv.
[24] Shuicheng Yan,et al. Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet , 2021, ArXiv.
[25] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Kaiming He,et al. Designing Network Design Spaces , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Hongyi Zhang,et al. mixup: Beyond Empirical Risk Minimization , 2017, ICLR.
[28] Matthieu Cord,et al. ResMLP: Feedforward Networks for Image Classification With Data-Efficient Training , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[29] Quoc V. Le,et al. Randaugment: Practical automated data augmentation with a reduced search space , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[30] Georg Heigold,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2021, ICLR.
[31] Seong Joon Oh,et al. CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[32] François Chollet,et al. Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Roozbeh Mottaghi,et al. Container: Context Aggregation Network , 2021, NeurIPS.
[34] Kurt Keutzer,et al. Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[35] Seong Joon Oh,et al. Rethinking Spatial Dimensions of Vision Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[36] Shi-Min Hu,et al. Beyond Self-Attention: External Attention Using Two Linear Layers for Visual Tasks , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[37] Matthieu Cord,et al. Going deeper with Image Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[38] Marcel Worring,et al. 4-Connected Shift Residual Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[39] Zhe Gan,et al. Chasing Sparsity in Vision Transformers: An End-to-End Exploration , 2021, NeurIPS.
[40] N. Codella,et al. CvT: Introducing Convolutions to Vision Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).