Zhe Gan | Lu Yuan | Zhangyang Wang | Lei Zhang | Tianlong Chen | Yu Cheng