Rare Words: A Major Problem for Contextualized Embeddings And How to Fix it by Attentive Mimicking
[1] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[2] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[3] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[4] Felix Hill, et al. Learning Distributed Representations of Sentences from Unlabelled Data, 2016, NAACL.
[5] Sebastian Ruder, et al. Universal Language Model Fine-tuning for Text Classification, 2018, ACL.
[6] David J. Weir, et al. Learning to Distinguish Hypernyms and Co-Hyponyms, 2014, COLING.
[7] Aline Villavicencio, et al. Incorporating Subword Information into Matrix Factorization Word Embeddings, 2018, arXiv.
[8] Mikhail Khodak, et al. A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors, 2018, ACL.
[9] Christopher D. Manning, et al. Better Word Representations with Recursive Neural Networks for Morphology, 2013, CoNLL.
[10] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[11] Alessandro Lenci, et al. How we BLESSed distributional semantic evaluation, 2011, GEMS.
[12] George A. Miller, et al. WordNet: A Lexical Database for English, 1995, HLT.
[13] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[14] Marco Marelli, et al. Compositional-ly Derived Representations of Morphologically Complex Words in Distributional Semantics, 2013, ACL.
[15] Sanja Fidler, et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books, 2015, ICCV.
[16] Tomas Mikolov, et al. Enriching Word Vectors with Subword Information, 2016, TACL.
[17] Marco Baroni, et al. High-risk learning: acquiring new word vectors from tiny data, 2017, EMNLP.
[18] Richard Socher, et al. The Natural Language Decathlon: Multitask Learning as Question Answering, 2018, arXiv.
[19] Jason Lee, et al. Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement, 2018, EMNLP.
[20] Quoc V. Le, et al. Distributed Representations of Sentences and Documents, 2014, ICML.
[21] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, arXiv.
[22] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[23] Kevin Gimpel, et al. Charagram: Embedding Words and Sentences via Character n-grams, 2016, EMNLP.
[24] Rémi Louf, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, arXiv.
[25] Hinrich Schütze, et al. Attentive Mimicking: Better Word Embeddings by Attending to Informative Contexts, 2019, NAACL-HLT.
[26] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[27] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[28] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[29] Jacob Eisenstein, et al. Mimicking Word Embeddings using Subword RNNs, 2017, EMNLP.