Jun Huang | Songfang Huang | Baobao Chang | Fei Huang | Fuli Luo | Chengyu Wang | Runxin Xu
[1] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[2] Furu Wei, et al. MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers, 2020, NeurIPS.
[3] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[4] Beliz Gunel, et al. Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning, 2020, ICLR.
[5] Alexander M. Rush, et al. Movement Pruning: Adaptive Sparsity by Fine-Tuning, 2020, NeurIPS.
[6] Dan Roth, et al. Pruning Redundant Mappings in Transformer Models via Spectral-Normalized Identity Prior, 2020, Findings of EMNLP.
[7] Kaiming He, et al. Momentum Contrast for Unsupervised Visual Representation Learning, 2020, CVPR.
[8] Max Welling, et al. Learning Sparse Neural Networks through L0 Regularization, 2017, ICLR.
[9] Preslav Nakov, et al. Poor Man's BERT: Smaller and Faster Transformer Models, 2020, arXiv.
[10] Yixin Liu, et al. SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization, 2021, ACL.
[11] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[12] Michael Carbin, et al. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, 2018, ICLR.
[13] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[14] Yanzhi Wang, et al. Reweighted Proximal Pruning for Large-Scale Language Representation, 2019, arXiv.
[15] Timo Aila, et al. Pruning Convolutional Neural Networks for Resource Efficient Inference, 2016, ICLR.
[16] Kyunghyun Cho, et al. Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models, 2020, ICLR.
[17] Armen Aghajanyan, et al. Better Fine-Tuning by Reducing Representational Collapse, 2020, ICLR.
[18] Chris Brockett, et al. Automatically Constructing a Corpus of Sentential Paraphrases, 2005, IJCNLP.
[19] Song Han, et al. Learning both Weights and Connections for Efficient Neural Network, 2015, NIPS.
[20] Qun Liu, et al. TinyBERT: Distilling BERT for Natural Language Understanding, 2020, EMNLP.
[21] Tianyu Gao, et al. SimCSE: Simple Contrastive Learning of Sentence Embeddings, 2021, EMNLP.
[22] Madian Khabsa, et al. CLEAR: Contrastive Learning for Sentence Representation, 2020, arXiv.
[23] Edouard Grave, et al. Reducing Transformer Depth on Demand with Structured Dropout, 2019, ICLR.
[24] Michael W. Mahoney, et al. Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT, 2019, AAAI.
[25] Xiaodong Liu, et al. Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization, 2021, ACL.
[26] Ji Li, et al. Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning, 2020, Findings of EMNLP.
[27] Ce Liu, et al. Supervised Contrastive Learning, 2020, NeurIPS.
[28] Anna Rumshisky, et al. When BERT Plays the Lottery, All Tickets Are Winning, 2020, EMNLP.
[29] Omer Levy, et al. Are Sixteen Heads Really Better than One?, 2019, NeurIPS.
[30] Wanxiang Che, et al. Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting, 2020, EMNLP.
[31] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[32] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, arXiv.
[33] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.
[34] Thomas Wolf, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, 2019, arXiv.
[35] Ashish Khetan, et al. schuBERT: Optimizing Elements of BERT, 2020, ACL.
[36] Yu Cheng, et al. Patient Knowledge Distillation for BERT Model Compression, 2019, EMNLP.
[37] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[38] Qun Liu, et al. DynaBERT: Dynamic BERT with Adaptive Width and Depth, 2020, NeurIPS.
[39] Jinxi Zhao, et al. Rethinking Network Pruning – under the Pre-train and Fine-tune Paradigm, 2021, NAACL.