Understanding the Semantic Content of Sparse Word Embeddings Using a Commonsense Knowledge Base

Word embeddings have developed into a major NLP tool with broad applicability. Understanding the semantic content of word embeddings remains an important challenge for further applications, and one aspect of this challenge is the interpretability of word embeddings. Sparse word embeddings have been proposed as models with improved interpretability. Continuing this line of research, we investigate the extent to which human-interpretable semantic concepts emerge along the bases of sparse word representations. To provide a broad evaluation framework, we consider three general approaches for constructing sparse word representations and evaluate them in multiple ways. We propose a novel methodology for evaluating the semantic content of word embeddings using a commonsense knowledge base, applied here to the sparse case. The methodology is illustrated by two techniques built on the ConceptNet knowledge base: the first assigns a commonsense concept label to each dimension of the embedding space, while the second uses a metric derived by spreading activation to quantify the coherence of the coordinates along each axis. We also report results on the relationship between the two techniques. The results show, for example, that within the individual dimensions of sparse word embeddings, words with high coefficients are more semantically related, in terms of path lengths in the knowledge base, than words with zero coefficients.
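To make the path-length comparison in the final sentence concrete, below is a minimal sketch of how one might score a single embedding dimension against a semantic graph. The tiny NetworkX graph stands in for ConceptNet, and the random sparse matrix `E` stands in for a learned sparse embedding; both are illustrative assumptions rather than the paper's actual data or implementation, and the spreading-activation metric itself is not reproduced here.

```python
# A minimal sketch of the path-length evaluation, assuming a NetworkX graph
# as a stand-in for ConceptNet and a toy matrix of sparse word embeddings.
from itertools import combinations

import networkx as nx
import numpy as np

# Toy semantic graph standing in for ConceptNet (edges = relatedness links).
G = nx.Graph()
G.add_edges_from([
    ("dog", "animal"), ("cat", "animal"), ("animal", "pet"),
    ("dog", "pet"), ("cat", "pet"), ("car", "vehicle"),
    ("bus", "vehicle"), ("vehicle", "road"), ("road", "animal"),
])

vocab = ["dog", "cat", "pet", "car", "bus", "road"]
rng = np.random.default_rng(0)
# Hypothetical sparse embedding matrix: rows = words, columns = dimensions.
# Multiplying by a random boolean mask zeroes out roughly half the entries.
E = rng.random((len(vocab), 4)) * (rng.random((len(vocab), 4)) > 0.5)

def mean_path_length(words):
    """Average shortest-path length in G over all reachable word pairs."""
    dists = [nx.shortest_path_length(G, u, v)
             for u, v in combinations(words, 2)
             if nx.has_path(G, u, v)]
    return sum(dists) / len(dists) if dists else float("inf")

dim = 0
coeffs = E[:, dim]
top = [vocab[i] for i in np.argsort(-coeffs)[:3]]       # high-coefficient words
zero = [vocab[i] for i in np.flatnonzero(coeffs == 0)]  # zero-coefficient words

print("top-coefficient words:", top, "mean path:", mean_path_length(top))
print("zero-coefficient words:", zero, "mean path:", mean_path_length(zero))
```

If the dimension is semantically coherent, the high-coefficient words should sit closer together in the graph (a smaller mean path length) than the zero-coefficient words, mirroring the comparison reported in the abstract.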
