Mehdi Rezagholizadeh | Vasileios Lioutas | Ahmad Rashid | Abbas Ghaddar
[1] Chris Brockett, et al. Automatically Constructing a Corpus of Sentential Paraphrases, 2005, IJCNLP.
[2] Ankur P. Parikh, et al. Thieves on Sesame Street! Model Extraction of BERT-based APIs, 2019, ICLR.
[3] Yuhong Guo, et al. Time-aware Large Kernel Convolutions, 2020, ICML.
[4] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[5] Qun Liu, et al. TinyBERT: Distilling BERT for Natural Language Understanding, 2020, EMNLP.
[6] Changshui Zhang, et al. Few Sample Knowledge Distillation for Efficient Network Compression, 2020, CVPR.
[7] Thomas Wolf, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, 2019, ArXiv.
[8] Ebru Arisoy, et al. Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets, 2013, ICASSP.
[9] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[10] Andrew McCallum, et al. Energy and Policy Considerations for Deep Learning in NLP, 2019, ACL.
[11] Michael W. Mahoney, et al. Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT, 2019, AAAI.
[12] Vinod Ganapathy, et al. A framework for the extraction of Deep Neural Networks by leveraging public data, 2019, ArXiv.
[13] Rich Caruana, et al. Model compression, 2006, KDD '06.
[14] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[15] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[16] Yong Cheng, et al. Robust Neural Machine Translation with Doubly Adversarial Inputs, 2019, ACL.
[17] R. Venkatesh Babu, et al. Zero-Shot Knowledge Distillation in Deep Networks, 2019, ICML.
[18] Dan Klein, et al. Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers, 2020, ArXiv.
[19] Amos Storkey, et al. Zero-shot Knowledge Transfer via Adversarial Belief Matching, 2019, NeurIPS.
[20] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[21] Thad Starner, et al. Data-Free Knowledge Distillation for Deep Neural Networks, 2017, ArXiv.
[22] Hongbo Zhang, et al. Quora Question Pairs, 2017.
[23] Qi Tian, et al. Data-Free Learning of Student Networks, 2019, ICCV.
[24] U Kang, et al. Knowledge Extraction with No Observable Data, 2019, NeurIPS.
[25] Andrew M. Dai, et al. Adversarial Training Methods for Semi-Supervised Text Classification, 2016, ICLR.
[26] Quan Z. Sheng, et al. Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey, 2019.
[27] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[28] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[29] Jonathon Shlens, et al. Explaining and Harnessing Adversarial Examples, 2014, ICLR.
[30] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[31] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[32] Dejing Dou, et al. HotFlip: White-Box Adversarial Examples for Text Classification, 2017, ACL.
[33] Matt J. Kusner, et al. GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution, 2016, ArXiv.
[34] Ido Dagan, et al. The Sixth PASCAL Recognizing Textual Entailment Challenge, 2009, TAC.
[35] Christopher D. Manning, et al. Compression of Neural Machine Translation Models via Pruning, 2016, CoNLL.
[36] Hao Wu, et al. Mixed Precision Training, 2017, ICLR.