Second-order contexts from lexical substitutes for few-shot learning of word representations

There is growing awareness of the need to handle rare and unseen words in word representation modelling. In this paper, we focus on few-shot learning of emerging concepts that fully exploits the few available contexts. We introduce a substitute-based context representation technique that can be applied to an existing word embedding space. Previous context-based approaches to modelling unseen words consider only bag-of-words first-order contexts, whereas our method aggregates contexts as second-order substitutes produced by a sequence-aware sentence completion model. We experimented with three tasks designed to test the modelling of emerging concepts. We found that these tasks place different emphasis on first-order and second-order contexts, and that our substitute-based method achieves superior performance on naturally occurring contexts from corpora.
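As a rough illustration of the second-order substitute idea, the sketch below induces a vector for an unseen word by masking it in each available context, querying a pretrained sentence completion model for its most probable substitutes, and averaging the substitutes' vectors from an existing embedding space, weighted by model probability. This is a minimal sketch under stated assumptions, not the paper's exact pipeline: it assumes BERT (via Hugging Face transformers) as the completion model, and `induce_embedding`, `contexts`, `embeddings`, and `top_k` are illustrative names, not from the paper.

```python
# Minimal sketch: induce an embedding for an unseen word from its
# second-order substitutes. Assumes `embeddings` is a dict-like map
# from word to vector in an existing embedding space (e.g. GloVe).
import numpy as np
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def induce_embedding(contexts, target, embeddings, top_k=10):
    """Average the embeddings of the target's top-k lexical substitutes,
    weighted by the completion model's probabilities, over all contexts."""
    weighted_sum, total_weight = None, 0.0
    for sentence in contexts:
        # Mask the unseen target word and let the model fill the slot.
        masked = sentence.replace(target, tokenizer.mask_token, 1)
        inputs = tokenizer(masked, return_tensors="pt")
        mask_pos = (
            inputs["input_ids"][0] == tokenizer.mask_token_id
        ).nonzero()[0].item()
        with torch.no_grad():
            logits = model(**inputs).logits[0, mask_pos]
        probs = torch.softmax(logits, dim=-1)
        top_probs, top_ids = probs.topk(top_k)
        for p, idx in zip(top_probs.tolist(), top_ids.tolist()):
            substitute = tokenizer.convert_ids_to_tokens(idx)
            if substitute in embeddings:  # skip subword pieces and OOVs
                vec = p * np.asarray(embeddings[substitute])
                weighted_sum = vec if weighted_sum is None else weighted_sum + vec
                total_weight += p
    return weighted_sum / total_weight if total_weight else None
```

For example, given a few sentences containing a novel word such as "zorp", the induced vector lands near the vectors of whatever substitutes the completion model predicts for its slots, which is what lets the method work from only a handful of contexts.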
