Semantic Structure-Based Word Embedding by Incorporating Concept Convergence and Word Divergence

Representing the semantics of words is a fundamental task in text processing. Several studies have shown that text corpora and knowledge bases (KBs) are complementary sources for learning word embeddings, yet most existing methods exploit a KB only through pairwise relationships between words. We argue that the structure of well-organized words within a KB conveys more effective and stable knowledge for capturing word semantics. In this paper, we propose a semantic structure-based word embedding method that introduces two structural constraints: concept convergence, which draws the words of a concept together, and word divergence, which keeps words within a concept distinguishable from one another. To assess the effectiveness of our method, we train on WordNet and conduct extensive experiments on word similarity, word analogy, text classification, and query expansion. The results show that our method outperforms state-of-the-art methods, including those trained solely on a corpus and those trained on both a corpus and KBs.
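
To make the two forces concrete, here is a minimal numerical sketch, not the paper's actual objective or implementation: it assumes a toy concept-to-word mapping standing in for WordNet synsets, and the hyperparameters (`dim`, `lr`, `lambda_div`, `margin`) are arbitrary illustrative choices. Each gradient step pulls every word toward its concept centroid (concept convergence), while a margin penalty stops sibling words from collapsing onto one another (word divergence).

```python
import numpy as np

# Toy semantic structure: each concept (standing in for a WordNet synset)
# groups a few member words. The mapping and all hyperparameters below
# (dim, lr, lambda_div, margin) are illustrative, not the paper's.
concepts = {
    "feline": ["cat", "tiger", "lion"],
    "canine": ["dog", "wolf", "coyote"],
}
vocab = [w for members in concepts.values() for w in members]
idx = {w: i for i, w in enumerate(vocab)}

rng = np.random.default_rng(0)
dim = 50
E = rng.normal(size=(len(vocab), dim))  # randomly initialized embeddings

lr, lambda_div, margin = 0.1, 0.5, 1.0

for step in range(200):
    grad = np.zeros_like(E)
    for members in concepts.values():
        rows = [idx[w] for w in members]
        centroid = E[rows].mean(axis=0)
        for i in rows:
            # Concept convergence: pull each word toward its concept's
            # centroid so the words of one concept cluster together.
            grad[i] += E[i] - centroid
            for j in rows:
                if j == i:
                    continue
                diff = E[i] - E[j]
                dist = np.linalg.norm(diff) + 1e-8
                # Word divergence: if two sibling words get closer than
                # the margin, push them apart so they stay distinguishable.
                if dist < margin:
                    grad[i] -= lambda_div * (margin - dist) * diff / dist
    E -= lr * grad  # one gradient-descent step on the combined objective

def pair_dist(a, b):
    return float(np.linalg.norm(E[idx[a]] - E[idx[b]]))

# Siblings end up close together (on the order of the margin); no force
# acts between words of different concepts, so those pairs stay far apart.
print(f"within concept : cat-tiger {pair_dist('cat', 'tiger'):.2f}")
print(f"across concepts: cat-dog   {pair_dist('cat', 'dog'):.2f}")
```

Running the sketch, within-concept distances shrink to roughly the margin while cross-concept distances remain large; in a full system one would typically combine such structural terms with a corpus-based objective such as skip-gram, which is what makes the text and the KB complementary.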
