Cloze-driven Pretraining of Self-attention Networks
Luke S. Zettlemoyer | Sergey Edunov | Alexei Baevski | Michael Auli | Yinhan Liu
[1] Chris Brockett,et al. Automatically Constructing a Corpus of Sentential Paraphrases , 2005, IJCNLP.
[2] Ido Dagan,et al. The Third PASCAL Recognizing Textual Entailment Challenge , 2007, ACL-PASCAL@ACL.
[3] J. Quiñonero-Candela,et al. Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Textual Entailment , 2006, Lecture Notes in Computer Science.
[4] Ido Dagan,et al. The Sixth PASCAL Recognizing Textual Entailment Challenge , 2009, TAC.
[5] Peter Clark,et al. The Seventh PASCAL Recognizing Textual Entailment Challenge , 2011, TAC.
[6] Christopher Potts,et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.
[7] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[8] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[9] Sanja Fidler,et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[10] Quoc V. Le,et al. Semi-supervised Sequence Learning , 2015, NIPS.
[11] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[12] Jian Zhang,et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text , 2016, EMNLP.
[13] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[14] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Restarts , 2016, ArXiv.
[15] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Ido Dagan,et al. context2vec: Learning Generic Context Embedding with Bidirectional LSTM , 2016, CoNLL.
[17] Alexander M. Rush,et al. Character-Aware Neural Language Models , 2015, AAAI.
[18] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[19] Hakan Inan,et al. Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling , 2016, ICLR.
[20] Lior Wolf,et al. Using the Output Embedding to Improve Language Models , 2016, EACL.
[21] Moustapha Cissé,et al. Efficient softmax approximation for GPUs , 2016, ICML.
[22] Eneko Agirre,et al. SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation , 2017, SemEval.
[23] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.
[24] Richard Socher,et al. Learned in Translation: Contextualized Word Vectors , 2017, NIPS.
[25] Samuel R. Bowman,et al. Language Modeling Teaches You More than Translation Does: Lessons Learned Through Auxiliary Syntactic Task Analysis , 2018, BlackboxNLP@EMNLP.
[26] Luke S. Zettlemoyer,et al. Deep Contextualized Word Representations , 2018, NAACL.
[27] Philipp Koehn,et al. Findings of the 2018 Conference on Machine Translation (WMT18) , 2018, WMT.
[28] Samuel R. Bowman,et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference , 2017, NAACL.
[29] Dan Klein,et al. Constituency Parsing with a Self-Attentive Encoder , 2018, ACL.
[30] Omer Levy,et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding , 2018, BlackboxNLP@EMNLP.
[31] Prakhar Gupta,et al. Learning Word Vectors for 157 Languages , 2018, LREC.
[32] Samuel R. Bowman,et al. Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks , 2018, ArXiv.
[33] Myle Ott,et al. Scaling Neural Machine Translation , 2018, WMT.
[34] Alec Radford,et al. Improving Language Understanding by Generative Pre-Training , 2018 .
[35] Alexei Baevski,et al. Adaptive Input Representations for Neural Language Modeling , 2018, ICLR.
[36] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[37] Myle Ott,et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling , 2019, NAACL.