BAM! Born-Again Multi-Task Networks for Natural Language Understanding
Quoc V. Le | Christopher D. Manning | Minh-Thang Luong | Urvashi Khandelwal | Kevin Clark
[1] Zachary Chase Lipton, et al. Born Again Neural Networks, 2018, ICML.
[2] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[3] Chris Brockett, et al. Automatically Constructing a Corpus of Sentential Paraphrases, 2005, IJCNLP.
[4] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[5] Alex Wang, et al. Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling, 2018, ACL.
[6] Quoc V. Le, et al. Semi-supervised Sequence Learning, 2015, NIPS.
[7] Di He, et al. Multilingual Neural Machine Translation with Knowledge Distillation, 2019, ICLR.
[8] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[9] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[10] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[11] Xiaodong Liu, et al. Multi-Task Deep Neural Networks for Natural Language Understanding, 2019, ACL.
[12] Rich Caruana, et al. Do Deep Nets Really Need to be Deep?, 2013, NIPS.
[13] Xiaodong Liu, et al. Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding, 2019, ArXiv.
[14] Xuanjing Huang, et al. Adversarial Multi-task Learning for Text Classification, 2017, ACL.
[15] Richard Socher, et al. Unifying Question Answering, Text Classification, and Regression via Span Extraction, 2019, ArXiv.
[16] Quoc V. Le, et al. Multi-task Sequence to Sequence Learning, 2015, ICLR.
[17] Samuel R. Bowman, et al. Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks, 2018, ArXiv.
[18] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[19] Richard Socher, et al. The Natural Language Decathlon: Multitask Learning as Question Answering, 2018, ArXiv.
[20] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[21] Zhi Jin, et al. Distilling Word Embeddings: An Encoding Approach, 2015, CIKM.
[22] Anders Søgaard, et al. Deep multi-task learning with low level tasks supervised at lower layers, 2016, ACL.
[23] Sebastian Ruder, et al. An Overview of Multi-Task Learning in Deep Neural Networks, 2017, ArXiv.
[24] Ruslan Salakhutdinov, et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, 2015, ICLR.
[25] Samuel R. Bowman, et al. Neural Network Acceptability Judgments, 2018, Transactions of the Association for Computational Linguistics.
[26] Rich Caruana, et al. Model compression, 2006, KDD '06.
[27] Barbara Plank, et al. When is multitask learning effective? Semantic sequence prediction under varying data conditions, 2016, EACL.
[28] Ido Dagan, et al. The Third PASCAL Recognizing Textual Entailment Challenge, 2007, ACL-PASCAL@ACL.
[29] Eneko Agirre, et al. SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation, 2017, *SEMEVAL.
[30] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML '08.
[31] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[32] Richard Socher, et al. Unifying Question Answering and Text Classification via Span Extraction, 2019, ArXiv.
[33] Joachim Bingel, et al. Identifying beneficial task relations for multi-task learning in deep neural networks, 2017, EACL.
[34] Noah A. Smith, et al. Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser, 2016, EMNLP.
[35] Yoshimasa Tsuruoka, et al. A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks, 2016, EMNLP.
[36] Rich Caruana. Multitask Learning, 1997, Machine Learning.
[37] Alexander M. Rush, et al. Sequence-Level Knowledge Distillation, 2016, EMNLP.
[38] S. Holm. A Simple Sequentially Rejective Multiple Test Procedure, 1979, Scandinavian Journal of Statistics.
[39] Thomas Wolf, et al. A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks, 2018, AAAI.
[40] Yee Whye Teh, et al. Distral: Robust multitask reinforcement learning, 2017, NIPS.
[41] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[42] Joachim Bingel, et al. Latent Multi-Task Architecture Learning, 2017, AAAI.
[43] Alex Wang, et al. Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling, 2018, ArXiv.
[44] Sebastian Ruder, et al. Universal Language Model Fine-tuning for Text Classification, 2018, ACL.