MUSE: Modularizing Unsupervised Sense Embeddings

This paper addresses word sense ambiguity in an unsupervised manner, where word sense representations are learned along with a word sense selection mechanism given contexts. Prior work focused on designing a single model to deliver both mechanisms, and thus suffered from either coarse-grained representation learning or inefficient sense selection. The proposed modular approach, MUSE, implements flexible modules that optimize the two mechanisms separately, achieving the first purely sense-level representation learning system with linear-time sense selection. We leverage reinforcement learning to enable joint training of the proposed modules, and introduce various exploration techniques for sense selection to improve robustness. Experiments on benchmark data show that the proposed approach achieves state-of-the-art performance on synonym selection as well as on contextual word similarity in terms of MaxSimC.
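The abstract describes the design only at a high level; the sketch below illustrates how such a decoupled architecture could look in principle. It is a minimal toy, assuming a skip-gram-style context scorer, a fixed sense inventory per word, and epsilon-greedy exploration; all names, sizes, and update rules here are illustrative assumptions, not the paper's exact formulation. The key property it demonstrates is that selecting a sense costs one dot product per sense, i.e., linear time in the number of senses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumed, not from the paper).
VOCAB, SENSES, DIM = 1000, 3, 50
sense_emb = rng.normal(scale=0.1, size=(VOCAB, SENSES, DIM))   # sense representation module
select_emb = rng.normal(scale=0.1, size=(VOCAB, SENSES, DIM))  # sense selection module
context_emb = rng.normal(scale=0.1, size=(VOCAB, DIM))         # context (output) embeddings

def select_sense(word, context_words, epsilon=0.1):
    """Linear-time sense selection: score each sense of `word` against the
    mean context vector and take the argmax, with epsilon-greedy exploration."""
    ctx = context_emb[context_words].mean(axis=0)
    scores = select_emb[word] @ ctx            # one dot product per sense
    if rng.random() < epsilon:                 # exploration for robustness
        return int(rng.integers(SENSES))
    return int(np.argmax(scores))

def sgd_step(word, sense, context_words, lr=0.025):
    """Skip-gram-style update of the chosen sense vector toward its context
    words (negative sampling omitted for brevity)."""
    for c in context_words:
        v, u = sense_emb[word, sense], context_emb[c]
        g = 1.0 - 1.0 / (1.0 + np.exp(-(v @ u)))   # gradient of log sigmoid(v.u)
        sense_emb[word, sense] += lr * g * u
        context_emb[c] += lr * g * v

# Toy usage: pick a sense for word 7 given a context window, then update it.
ctx = [3, 42, 99, 513]
s = select_sense(7, ctx)
sgd_step(7, s, ctx)
```

Because selection and representation live in separate parameter tables (`select_emb` vs. `sense_emb`), each module can be optimized with its own objective, which is the modularity the abstract emphasizes; in the paper this joint training is driven by reinforcement learning rather than the plain SGD step sketched here.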
