Neural Networks with Attention for Word Sense Induction
Attention-based neural networks have achieved remarkable results on a number of tasks in recent years. Their success in natural language processing, especially in machine translation, suggests that such models can capture the meaning of an ambiguous word from its context. In this paper we introduce a new method for constructing vectors of occurrences of ambiguous words for word sense induction, based on the recently introduced Transformer model, which achieved state-of-the-art results in machine translation. Similarly to the CBOW model for constructing word embeddings, we train the Transformer to predict a word from its context and use its trained parameters for word sense induction. On some datasets the proposed method outperforms a simple but hard-to-beat baseline that was among the three best methods in RUSSE-WSI 2018, the recent shared task on word sense induction for the Russian language. On one of the datasets our method beats the top result of the competition. Furthermore, we explore how different ways of weighting word embeddings affect performance on word sense induction: alongside weighted sums of word2vec vectors, we evaluate vectors from the Transformer's hidden layers and introduce a combined approach that improves on previous results.
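The word sense induction pipeline described here — represent each occurrence of an ambiguous word by a weighted average of its context-word embeddings, then cluster the occurrence vectors — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `embeddings` lookup, the IDF-style `weights`, and the number of senses are placeholders, and Ward's method (reference [2]) is one possible clustering choice.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def occurrence_vector(context_words, embeddings, weights, dim=300):
    """Weighted average of context-word embeddings for one occurrence.

    embeddings: dict mapping word -> np.ndarray of shape (dim,)
                (e.g. word2vec vectors; hypothetical lookup)
    weights:    dict mapping word -> float (e.g. IDF-style weights;
                unweighted words default to 1.0)
    """
    vecs = [weights.get(w, 1.0) * embeddings[w]
            for w in context_words if w in embeddings]
    if not vecs:  # no known context words: fall back to a zero vector
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

def induce_senses(occurrences, embeddings, weights, n_senses=2):
    """Cluster occurrence vectors with Ward's method to induce senses.

    occurrences: list of context-word lists, one per occurrence of the
                 ambiguous word; returns one cluster label per occurrence.
    """
    X = np.stack([occurrence_vector(ctx, embeddings, weights)
                  for ctx in occurrences])
    clustering = AgglomerativeClustering(n_clusters=n_senses, linkage="ward")
    return clustering.fit_predict(X)
```

In the Transformer-based variant of the method, `occurrence_vector` would instead be taken from the hidden-layer states of a Transformer trained, CBOW-style, to predict the target word from its context; the clustering step stays the same.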
[1] Alexander Panchenko, et al. How much does a word weigh? Weighting word embeddings for word sense induction, 2018, ArXiv.
[2] J. H. Ward. Hierarchical Grouping to Optimize an Objective Function, 1963.
[3] Dmitry Ustalov, et al. RUSSE'2018: A Shared Task on Word Sense Induction for the Russian Language, 2018, ArXiv.
[4] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.