[1] Matthieu Cord, et al. Training data-efficient image transformers & distillation through attention, 2020, ICML.
[2] Ameet Talwalkar, et al. Random Search and Reproducibility for Neural Architecture Search, 2019, UAI.
[3] Hanrui Wang, et al. SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning, 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA).
[4] Dacheng Tao, et al. Patch Slimming for Efficient Vision Transformers, 2021, arXiv.
[5] Aude Oliva, et al. IA-RED2: Interpretability-Aware Redundancy Reduction for Vision Transformers, 2021, NeurIPS.
[6] Georg Heigold, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2021, ICLR.
[7] Dacheng Tao, et al. Pruning Self-attentions into Convolutional Layers in Single Path, 2021, arXiv.
[8] Deng Cai, et al. Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework, 2021, ICML.
[9] Dacheng Tao, et al. Efficient Vision Transformers via Fine-Grained Manifold Distillation, 2021, arXiv.
[10] Weiming Dong, et al. Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer, 2021, AAAI.
[11] Dong Xu, et al. Multi-Dimensional Pruning: A Unified Framework for Model Compression, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[12] C. Baker. Joint measures and cross-covariance operators, 1973.
[13] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[14] Jianxin Wu, et al. A Unified Pruning Framework for Vision Transformers, 2021, Science China Information Sciences.
[15] Glenn M. Fung, et al. Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention, 2021, AAAI.
[16] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[17] Ilya Sutskever, et al. Generating Long Sequences with Sparse Transformers, 2019, arXiv.
[18] Jiwen Lu, et al. DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification, 2021, NeurIPS.
[19] Yiran Chen, et al. Learning Structured Sparsity in Deep Neural Networks, 2016, NIPS.
[20] Kai Han, et al. Visual Transformer Pruning, 2021, arXiv.
[21] Shuicheng Yan, et al. Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet, 2021, arXiv.
[22] Ling Shao, et al. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions, 2021, arXiv.
[23] Zhe Gan, et al. Chasing Sparsity in Vision Transformers: An End-to-End Exploration, 2021, NeurIPS.
[24] Pavlo Molchanov, et al. NViT: Vision Transformer Compression and Parameter Redistribution, 2021, arXiv.
[25] Liujuan Cao, et al. Towards Optimal Structured CNN Pruning via Generative Adversarial Learning, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Bernhard Schölkopf, et al. Measuring Statistical Dependence with Hilbert-Schmidt Norms, 2005, ALT.
[27] Ross B. Girshick, et al. Focal Loss for Dense Object Detection, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[28] Xiangyu Zhang, et al. Joint Multi-Dimension Pruning, 2020, arXiv.
[29] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive Computation and Machine Learning.
[30] Siwei Ma, et al. Post-Training Quantization for Vision Transformer, 2021, NeurIPS.
[31] Han Fang, et al. Linformer: Self-Attention with Linear Complexity, 2020, arXiv.
[32] Xiangyu Zhang, et al. Single Path One-Shot Neural Architecture Search with Uniform Sampling, 2019, ECCV.
[33] Naiyan Wang, et al. Data-Driven Sparse Structure Selection for Deep Neural Networks, 2017, ECCV.