Fast Vision Transformers with HiLo Attention
[1] Jiwen Lu, et al. HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions, 2022, NeurIPS.
[2] Jing Zhang, et al. VSA: Learning Varied-Size Window Attention in Vision Transformers, 2022, ECCV.
[3] Errui Ding, et al. MixFormer: Mixing Features across Windows and Dimensions, 2022, CVPR.
[4] Chengrou Lu, et al. Visual Attention Network, 2022, Computational Visual Media.
[5] Songkuk Kim, et al. How Do Vision Transformers Work?, 2022, ICLR.
[6] Trevor Darrell, et al. A ConvNet for the 2020s, 2022, CVPR.
[7] P. Milanfar, et al. MAXIM: Multi-Axis MLP for Image Processing, 2022, CVPR.
[8] Siyu Zhu, et al. QuadTree Attention for Vision Transformers, 2022, ICLR.
[9] S. Song, et al. Vision Transformer with Deformable Attention, 2022, CVPR.
[10] Mohammad Rastegari, et al. MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer, 2021, ICLR.
[11] Ross Wightman, et al. ResNet Strikes Back: An Improved Training Procedure in timm, 2021, ArXiv.
[12] Lu Yuan, et al. Mobile-Former: Bridging MobileNet and Transformer, 2022, CVPR.
[13] Wanli Ouyang, et al. GLiT: Neural Architecture Search for Global and Local Image Transformer, 2021, ICCV.
[14] Nenghai Yu, et al. CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows, 2022, CVPR.
[15] Lu Yuan, et al. Focal Self-attention for Local-Global Interactions in Vision Transformers, 2021, ArXiv.
[16] Minghao Chen, et al. AutoFormer: Searching Transformers for Visual Recognition, 2021, ICCV.
[17] Jiwen Lu, et al. Global Filter Networks for Image Classification, 2021, NeurIPS.
[18] Trevor Darrell, et al. Early Convolutions Help Transformers See Better, 2021, NeurIPS.
[19] P. Luo, et al. PVT v2: Improved Baselines with Pyramid Vision Transformer, 2021, Computational Visual Media.
[20] Matthijs Douze, et al. XCiT: Cross-Covariance Image Transformers, 2021, NeurIPS.
[21] Quoc V. Le, et al. CoAtNet: Marrying Convolution and Attention for All Data Sizes, 2021, NeurIPS.
[22] Zilong Huang, et al. Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer, 2021, ArXiv.
[23] Shijian Lu, et al. RDA: Robust Domain Adaptation via Fourier Adversarial Attacking, 2021, ICCV.
[24] Jianfei Cai, et al. Less is More: Pay Less Attention in Vision Transformers, 2021, AAAI.
[25] Chunhua Shen, et al. Twins: Revisiting the Design of Spatial Attention in Vision Transformers, 2021, NeurIPS.
[26] Matthieu Cord, et al. Going Deeper with Image Transformers, 2021, ICCV.
[27] N. Codella, et al. CvT: Introducing Convolutions to Vision Transformers, 2021, ICCV.
[28] Lu Yuan, et al. Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding, 2021, ICCV.
[29] Fengwei Yu, et al. Incorporating Convolution Designs into Visual Transformers, 2021, ICCV.
[30] Bohan Zhuang, et al. Scalable Visual Transformers with Hierarchical Pooling, 2021, ArXiv.
[31] Roy Schwartz, et al. Random Feature Attention, 2021, ICLR.
[32] Enhua Wu, et al. Transformer in Transformer, 2021, NeurIPS.
[33] Xiang Li, et al. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions, 2021, ICCV.
[34] Francis E. H. Tay, et al. Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet, 2021, ICCV.
[35] Matthieu Cord, et al. Training Data-Efficient Image Transformers & Distillation through Attention, 2020, ICML.
[36] S. Gelly, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2020, ICLR.
[37] Lucy J. Colwell, et al. Rethinking Attention with Performers, 2020, ICLR.
[38] Nikolaos Pappas, et al. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention, 2020, ICML.
[39] Han Fang, et al. Linformer: Self-Attention with Linear Complexity, 2020, ArXiv.
[40] Tong Tong, et al. Guided Frequency Separation Network for Real-World Super-Resolution, 2020, CVPRW.
[41] Tie-Yan Liu, et al. Invertible Image Rescaling, 2020, ECCV.
[42] Yuhao Wang, et al. Learning in the Frequency Domain, 2020, CVPR.
[43] Sen Jia, et al. How Much Position Information Do Convolutional Neural Networks Encode?, 2020, ICLR.
[44] Timothy P. Lillicrap, et al. Compressive Transformers for Long-Range Sequence Modelling, 2019, ICLR.
[45] Martin Jaggi, et al. On the Relationship between Self-Attention and Convolutional Layers, 2019, ICLR.
[46] Radu Timofte, et al. Frequency Separation for Real-World Super-Resolution, 2019, ICCVW.
[47] Kai Chen, et al. MMDetection: Open MMLab Detection Toolbox and Benchmark, 2019, ArXiv.
[48] Ilya Sutskever, et al. Generating Long Sequences with Sparse Transformers, 2019, ArXiv.
[49] Shuicheng Yan, et al. Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution, 2019, ICCV.
[50] Kaiming He, et al. Panoptic Feature Pyramid Networks, 2019, CVPR.
[51] Xiangyu Zhang, et al. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design, 2018, ECCV.
[52] Kaiming He, et al. Focal Loss for Dense Object Detection, 2017, ICCV.
[53] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[54] Ross B. Girshick, et al. Mask R-CNN, 2017, ICCV.
[55] Zhuowen Tu, et al. Aggregated Residual Transformations for Deep Neural Networks, 2017, CVPR.
[56] Bolei Zhou, et al. Semantic Understanding of Scenes Through the ADE20K Dataset, 2016, International Journal of Computer Vision.
[57] Geoffrey E. Hinton, et al. Layer Normalization, 2016, ArXiv.
[58] Kevin Gimpel, et al. Gaussian Error Linear Units (GELUs), 2016, ArXiv.
[59] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, CVPR.
[60] Yixin Chen, et al. Compressing Convolutional Neural Networks in the Frequency Domain, 2015, KDD.
[61] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[62] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[64] G. Deng, et al. An Adaptive Gaussian Filter for Noise Reduction and Edge Detection, 1993, IEEE Nuclear Science Symposium and Medical Imaging Conference.
[65] E. Voigtman, et al. Low-pass Filters for Signal Averaging, 1986.
[66] Peter D. Welch, et al. The Fast Fourier Transform and Its Applications, 1969.
[67] Stephen Lin, et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, 2021, ICCV.