Yanzhi Wang | Xiaolong Ma | Wei Niu | Zhenglun Kong | Minghai Qin | Bin Ren | Hao Tang | Peiyan Dong | Mengshu Sun | Xin Meng
[1] Matthijs Douze, et al. LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[2] Huchuan Lu, et al. Transformer Tracking, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Dacheng Tao, et al. Patch Slimming for Efficient Vision Transformers, 2021, ArXiv.
[4] C. Lawrence Zitnick, et al. Generative Adversarial Transformers, 2021, ICML.
[5] Nicu Sebe, et al. Transformer-Based Attention Networks for Continuous Pixel-Wise Prediction, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[6] Ling Shao, et al. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions, 2021, ArXiv.
[7] Tao Xiang, et al. Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Wanli Ouyang, et al. PSViT: Better Vision Transformer via Token Pooling and Attention Sharing, 2021, ArXiv.
[9] Aude Oliva, et al. IA-RED2: Interpretability-Aware Redundancy Reduction for Vision Transformers, 2021, NeurIPS.
[10] Georg Heigold, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2021, ICLR.
[11] Jianfei Cai, et al. Scalable Vision Transformers with Hierarchical Pooling, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[12] Ivan Laptev, et al. Training Vision Transformers for Image Retrieval, 2021, ArXiv.
[13] Eric Sommerlade, et al. ATS: Adaptive Token Sampling For Efficient Vision Transformers, 2021, ArXiv.
[14] Dacheng Tao, et al. Efficient Vision Transformers via Fine-Grained Manifold Distillation, 2021, ArXiv.
[15] Junying Chen, et al. UP-DETR: Unsupervised Pre-training for Object Detection with Transformers, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Ji Li, et al. Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning, 2020, Findings of EMNLP.
[17] Fengwei Yu, et al. Incorporating Convolution Designs into Visual Transformers, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[18] Kaiming He, et al. Designing Network Design Spaces, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Sven Behnke, et al. T6D-Direct: Transformers for Multi-Object 6D Pose Direct Regression, 2021, GCPR.
[20] Cho-Jui Hsieh, et al. When Vision Transformers Outperform ResNets without Pretraining or Strong Data Augmentations, 2021, ArXiv.
[21] Ralph R. Martin, et al. PCT: Point cloud transformer, 2020, Computational Visual Media.
[22] Nicu Sebe, et al. AniFormer: Data-driven 3D Animation with Transformer, 2021, BMVC.
[23] Alexey Dosovitskiy, et al. Do Vision Transformers See Like Convolutional Neural Networks?, 2021, ArXiv.
[24] Jianlong Fu, et al. Learning Spatio-Temporal Transformer for Visual Tracking, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[25] Roozbeh Mottaghi, et al. Container: Context Aggregation Network, 2021, NeurIPS.
[26] Zhe Gan, et al. Chasing Sparsity in Vision Transformers: An End-to-End Exploration, 2021, NeurIPS.
[27] Enhua Wu, et al. Transformer in Transformer, 2021, NeurIPS.
[28] Jiwen Lu, et al. DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification, 2021, NeurIPS.
[29] Zhuowen Tu, et al. Co-Scale Conv-Attentional Image Transformers, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[30] Zhiqiang Shen, et al. Learning Efficient Convolutional Networks through Network Slimming, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[31] Quanfu Fan, et al. CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[33] Wen Gao, et al. Pre-Trained Image Processing Transformer, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] N. Codella, et al. CvT: Introducing Convolutions to Vision Transformers, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[35] Kurt Keutzer, et al. You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module, 2021, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[36] Ting Chen, et al. Pix2seq: A Language Modeling Framework for Object Detection, 2021, ArXiv.
[37] Jiaya Jia, et al. Exploring and Improving Mobile Level Vision Transformers, 2021, ArXiv.
[38] Matthieu Cord, et al. Training data-efficient image transformers & distillation through attention, 2020, ICML.
[39] Yan Peng, et al. Dual-stream Network for Visual Recognition, 2021, ArXiv.
[40] Hanrui Wang, et al. SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning, 2020, 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA).
[41] Geoffrey E. Hinton, et al. Similarity of Neural Network Representations Revisited, 2019, ICML.
[42] Juncheng Li, et al. Efficient Transformer for Single Image Super-Resolution, 2021, ArXiv.
[43] Seong Joon Oh, et al. Rethinking Spatial Dimensions of Vision Transformers, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[44] Matthijs Douze, et al. XCiT: Cross-Covariance Image Transformers, 2021, NeurIPS.
[45] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[46] Xiaojie Jin, et al. Refiner: Refining Self-attention for Vision Transformers, 2021, ArXiv.
[47] Guodong Guo, et al. TransFER: Learning Relation-aware Facial Expression Representations with Transformers, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[48] Alexander Kolesnikov, et al. Scaling Vision Transformers, 2021, ArXiv.
[49] Kai Han, et al. Visual Transformer Pruning, 2021, ArXiv.
[50] Pieter Abbeel, et al. Bottleneck Transformers for Visual Recognition, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Furu Wei, et al. BEiT: BERT Pre-Training of Image Transformers, 2021, ArXiv.
[52] Weiming Dong, et al. Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer, 2021, AAAI.
[53] Hongyang Chao, et al. Rethinking and Improving Relative Position Encoding for Vision Transformer, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[54] Stephen Lin, et al. Instance Localization for Self-supervised Detection Pretraining, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Luc Van Gool, et al. MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation, 2021, ArXiv.
[56] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[57] Nicu Sebe, et al. Efficient Training of Visual Transformers with Small-Size Datasets, 2021, ArXiv.
[58] Yingda Xia, et al. Glance-and-Gaze Vision Transformer, 2021, NeurIPS.
[59] Baining Guo, et al. Learning Texture Transformer Network for Image Super-Resolution, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[60] Kurt Keutzer, et al. Visual Transformers: Token-based Image Representation and Processing for Computer Vision, 2020, ArXiv.
[61] Minyi Guo, et al. Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity, 2020, SC20: International Conference for High Performance Computing, Networking, Storage and Analysis.
[62] Song Han, et al. Learning both Weights and Connections for Efficient Neural Network, 2015, NIPS.
[63] Jiayu Li, et al. ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Methods of Multipliers, 2018, ASPLOS.
[64] Enhua Wu, et al. Squeeze-and-Excitation Networks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[65] Lior Wolf, et al. Transformer Interpretability Beyond Attention Visualization, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[66] Stephen Lin, et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[67] Nicolas Usunier, et al. End-to-End Object Detection with Transformers, 2020, ECCV.
[68] Julian Martin Eisenschlos, et al. SoftSort: A Continuous Relaxation for the argsort Operator, 2020, ICML.
[69] Lorenzo Bruzzone, et al. Looking Outside the Window: Wider-Context Transformer for the Semantic Segmentation of High-Resolution Remote Sensing Images, 2021, ArXiv.
[70] Klaus Dietmayer, et al. Point Transformer, 2020, IEEE Access.
[71] Xiaojie Jin, et al. All Tokens Matter: Token Labeling for Training Better Vision Transformers, 2021, NeurIPS.
[72] Shuicheng Yan, et al. Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet, 2021, ArXiv.
[73] Eun-Sol Kim, et al. HOTR: End-to-End Human-Object Interaction Detection with Transformers, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[74] Rong Jin, et al. KVT: k-NN Attention for Boosting Vision Transformers, 2021, ArXiv.
[75] Michael S. Ryoo, et al. TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?, 2021, ArXiv.
[76] Dahua Lin, et al. Vision Transformer with Progressive Sampling, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[77] Alexander Kolesnikov, et al. How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers, 2021, ArXiv.
[78] Alexander G. Schwing, et al. Per-Pixel Classification is Not All You Need for Semantic Segmentation, 2021, NeurIPS.
[79] Jianxin Wu, et al. A unified pruning framework for vision transformers, 2021, Science China Information Sciences.
[80] Alexander M. Rush, et al. Movement Pruning: Adaptive Sparsity by Fine-Tuning, 2020, NeurIPS.
[81] Minghao Chen, et al. AutoFormer: Searching Transformers for Visual Recognition, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[82] Laura Leal-Taixe, et al. TrackFormer: Multi-Object Tracking with Transformers, 2021, ArXiv.
[83] Wengang Zhou, et al. TransVG: End-to-End Visual Grounding with Transformers, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).