Ke Xu | Furu Wei | Canwen Xu | Tao Ge | Wangchunshu Zhou | Julian McAuley
[1] Alex Graves,et al. Adaptive Computation Time for Recurrent Neural Networks , 2016, ArXiv.
[2] Holger Ulmer,et al. Ensemble Methods as a Defense to Adversarial Perturbations Against Deep Neural Networks , 2017, ArXiv.
[3] Quoc V. Le,et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.
[4] Samy Bengio,et al. Adversarial examples in the physical world , 2016, ICLR.
[5] Fedor Moiseev,et al. Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned , 2019, ACL.
[6] Anders Krogh,et al. Neural Network Ensembles, Cross Validation, and Active Learning , 1994, NIPS.
[7] Tianlong Chen,et al. Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference , 2020, ICLR.
[8] Xiangyu Zhang,et al. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices , 2018, CVPR.
[9] Lukasz Kaiser,et al. Universal Transformers , 2018, ICLR.
[10] Peter Szolovits,et al. Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment , 2019, ArXiv.
[11] Omer Levy,et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding , 2018, BlackboxNLP@EMNLP.
[12] H. T. Kung,et al. BranchyNet: Fast inference via early exiting from deep neural networks , 2016, ICPR.
[13] Christopher Potts,et al. A large annotated corpus for learning natural language inference , 2015, EMNLP.
[14] Mohammad Shoeybi,et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism , 2019, ArXiv.
[15] Roy Schwartz,et al. The Right Tool for the Job: Matching Model and Instance Complexities , 2020, ACL.
[16] Dan Boneh,et al. Ensemble Adversarial Training: Attacks and Defenses , 2017, ICLR.
[17] Thomas Wolf,et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter , 2019, ArXiv.
[18] Jian Cheng,et al. Quantized Convolutional Neural Networks for Mobile Devices , 2016, CVPR.
[19] Peng Zhou,et al. FastBERT: a Self-distilling BERT with Adaptive Inference Time , 2020, ACL.
[20] Furu Wei,et al. BERT-of-Theseus: Compressing BERT by Progressive Module Replacing , 2020, EMNLP.
[21] Qun Liu,et al. TinyBERT: Distilling BERT for Natural Language Understanding , 2020, EMNLP.
[22] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[23] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[24] Omer Levy,et al. Are Sixteen Heads Really Better than One? , 2019, NeurIPS.
[25] Christopher Potts,et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.
[26] Chris Brockett,et al. Automatically Constructing a Corpus of Sentential Paraphrases , 2005, IJCNLP.
[27] Lutz Prechelt,et al. Early Stopping - But When? , 2012, Neural Networks: Tricks of the Trade.
[28] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[29] Yu Cheng,et al. Patient Knowledge Distillation for BERT Model Compression , 2019, EMNLP.
[30] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[31] Joan Bruna,et al. Intriguing properties of neural networks , 2013, ICLR.
[32] Edouard Grave,et al. Reducing Transformer Depth on Demand with Structured Dropout , 2019, ICLR.
[33] Samuel R. Bowman,et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference , 2017, NAACL.
[34] Tudor Dumitras,et al. Shallow-Deep Networks: Understanding and Mitigating Network Overthinking , 2018, ICML.
[35] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[36] Yiming Yang,et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding , 2019, NeurIPS.
[37] Maya R. Gupta,et al. To Trust Or Not To Trust A Classifier , 2018, NeurIPS.
[38] Xiang Zhang,et al. Character-level Convolutional Networks for Text Classification , 2015, NIPS.
[39] Douwe Kiela,et al. SentEval: An Evaluation Toolkit for Universal Sentence Representations , 2018, LREC.
[40] Xin Wang,et al. SkipNet: Learning Dynamic Routing in Convolutional Networks , 2017, ECCV.
[41] Ning Chen,et al. Improving Adversarial Robustness via Promoting Ensemble Diversity , 2019, ICML.
[42] Vijay Vasudevan,et al. Learning Transferable Architectures for Scalable Image Recognition , 2018, CVPR.
[43] Samuel R. Bowman,et al. Neural Network Acceptability Judgments , 2018, Transactions of the Association for Computational Linguistics.
[44] Rémi Louf,et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing , 2019, ArXiv.
[45] Hervé Bourlard,et al. Generalization and Parameter Estimation in Feedforward Nets: Some Experiments , 1989, NIPS.
[46] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2016, CVPR.
[47] Hector J. Levesque,et al. The Winograd Schema Challenge , 2011, AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning.
[48] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[49] Jimmy J. Lin,et al. DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference , 2020, ACL.
[50] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009.
[51] Chuang Gan,et al. Once for All: Train One Network and Specialize it for Efficient Deployment , 2019, ICLR.
[52] Jian Zhang,et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text , 2016, EMNLP.
[53] Colin Raffel,et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res.
[54] Kevin Gimpel,et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations , 2019, ICLR.