Cross-Topic Distributional Semantic Representations Via Unsupervised Mappings

In traditional Distributional Semantic Models (DSMs), the multiple senses of a polysemous word are conflated into a single vector space representation. In this work, we propose a DSM that learns multiple distributional representations of a word based on different topics. First, a separate DSM is trained for each topic; the topic-based DSMs are then aligned to a common vector space. Our unsupervised mapping approach is motivated by the hypothesis that words preserving their relative distances across topic-based semantic sub-spaces constitute robust semantic anchors that define the mappings between those sub-spaces. Aligned cross-topic representations achieve state-of-the-art results on the task of contextual word similarity. Furthermore, evaluation on downstream NLP tasks shows that multiple topic-based embeddings outperform single-prototype models.
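The abstract does not specify the alignment algorithm, but the described pipeline (select anchor words whose relative distances are preserved across two topic sub-spaces, then map one sub-space onto the other) can be sketched with standard tools. Below is a minimal, illustrative Python sketch: the anchor criterion (correlating each word's cosine-similarity profile across the two spaces) and all function names are assumptions for demonstration, and the mapping uses the classical orthogonal Procrustes solution (Schönemann, 1966), a common choice for unsupervised embedding alignment rather than necessarily the paper's exact method.

    import numpy as np

    def normalize(M):
        # L2-normalize rows so dot products are cosine similarities
        return M / np.linalg.norm(M, axis=1, keepdims=True)

    def select_anchors(X, Y, n_anchors=500):
        """Rank shared-vocabulary words by how well their cosine-similarity
        profile is preserved across two topic sub-spaces (assumed criterion)."""
        Sx = X @ X.T  # (V, V) similarity matrices; fine for a demo-sized vocabulary
        Sy = Y @ Y.T
        # Pearson correlation between each word's similarity rows in the two spaces
        Sx_c = Sx - Sx.mean(axis=1, keepdims=True)
        Sy_c = Sy - Sy.mean(axis=1, keepdims=True)
        scores = (Sx_c * Sy_c).sum(axis=1) / (
            np.linalg.norm(Sx_c, axis=1) * np.linalg.norm(Sy_c, axis=1)
        )
        return np.argsort(-scores)[:n_anchors]

    def procrustes_map(X, Y, anchor_idx):
        """Closed-form orthogonal Procrustes: the orthogonal W minimizing
        ||X_a W - Y_a||_F is U V^T, where U S V^T = SVD(X_a^T Y_a)."""
        A, B = X[anchor_idx], Y[anchor_idx]
        U, _, Vt = np.linalg.svd(A.T @ B)
        return U @ Vt

    if __name__ == "__main__":
        # Synthetic check: Y is a rotated, noisy copy of X, mimicking a
        # second topic sub-space over the same vocabulary.
        rng = np.random.default_rng(0)
        V, d = 1000, 50
        X = normalize(rng.standard_normal((V, d)))
        Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random rotation
        Y = normalize(X @ Q + 0.01 * rng.standard_normal((V, d)))
        anchors = select_anchors(X, Y, n_anchors=200)
        W = procrustes_map(X, Y, anchors)
        X_aligned = normalize(X @ W)
        print("mean cosine after alignment:", float((X_aligned * Y).sum(axis=1).mean()))

Restricting the mapping to an orthogonal matrix is a natural fit for the anchor hypothesis: an orthogonal transform preserves distances and angles, so the monolingual structure of each topic sub-space survives the alignment intact.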
