BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance
[1] Walter Daelemans, et al. Pattern for Python, 2012, J. Mach. Learn. Res.
[2] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, ArXiv.
[3] Rémi Louf, et al. Transformers: State-of-the-art Natural Language Processing, 2019.
[4] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[5] Jacob Eisenstein, et al. Mimicking Word Embeddings using Subword RNNs, 2017, EMNLP.
[6] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[7] Marco Baroni, et al. High-risk learning: acquiring new word vectors from tiny data, 2017, EMNLP.
[8] Mikhail Khodak, et al. A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors, 2018, ACL.
[9] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[10] Guy Emerson, et al. Bad Form: Comparing Context-Based and Form-Based Few-Shot Learning in Distributional Semantic Models, 2019, DeepLo@EMNLP-IJCNLP.
[11] Dejing Dou, et al. HotFlip: White-Box Adversarial Examples for Text Classification, 2017, ACL.
[12] Angeliki Lazaridou, et al. Multimodal Word Meaning Induction From Minimal Exposure to Natural Text, 2017, Cognitive Science.
[13] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[14] Hinrich Schütze, et al. Rare Words: A Major Problem for Contextualized Embeddings and How to Fix It by Attentive Mimicking, 2019, AAAI.
[15] Jimmy J. Lin, et al. What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning, 2019, ArXiv.
[16] Aline Villavicencio, et al. Incorporating Subword Information into Matrix Factorization Word Embeddings, 2018, ArXiv.
[17] Xiang Zhang, et al. Character-level Convolutional Networks for Text Classification, 2015, NIPS.
[18] Luke S. Zettlemoyer, et al. Cloze-driven Pretraining of Self-attention Networks, 2019, EMNLP.
[19] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[20] Alexander M. Rush, et al. Character-Aware Neural Language Models, 2015, AAAI.
[21] Christopher D. Manning, et al. Better Word Representations with Recursive Neural Networks for Morphology, 2013, CoNLL.
[22] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[23] Sanja Fidler, et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books, 2015, IEEE International Conference on Computer Vision (ICCV).
[24] Tomas Mikolov, et al. Enriching Word Vectors with Subword Information, 2016, TACL.
[25] Anna Korhonen, et al. Second-order contexts from lexical substitutes for few-shot learning of word representations, 2019, *SEM@NAACL-HLT.
[26] Hinrich Schütze, et al. Attentive Mimicking: Better Word Embeddings by Attending to Informative Contexts, 2019, NAACL-HLT.
[27] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[28] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[29] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[30] Jens Lehmann, et al. DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia, 2015, Semantic Web.
[31] Ido Dagan, et al. context2vec: Learning Generic Context Embedding with Bidirectional LSTM, 2016, CoNLL.
[32] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.
[33] Fabrizio Silvestri, et al. Misspelling Oblivious Word Embeddings, 2019, NAACL.
[34] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[35] George A. Miller, et al. WordNet: A Lexical Database for English, 1995, HLT.