Combining and learning word embedding with WordNet for semantic relatedness and similarity measurement

In this research, we propose three approaches to measuring the semantic relatedness between two words: (i) boosting the performance of the GloVe word embedding model by removing or transforming abnormal dimensions; (ii) linearly combining the information extracted from WordNet and word embeddings; and (iii) using word embeddings and 12 linguistic features extracted from WordNet as input features for Support Vector Regression. We conducted our experiments on eight benchmark data sets and computed Spearman correlations between the outputs of our methods and the ground truth. We report our results alongside those of three state-of-the-art approaches. The experimental results show that our methods can outperform the state-of-the-art approaches on all the selected English benchmark data sets.
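As a rough illustration of approach (ii) and of the evaluation protocol, the sketch below linearly combines an embedding-based cosine score with a WordNet-based score and computes a Spearman correlation from ranks. The weight `alpha`, the function names, and the idea of a single WordNet score in [0, 1] are assumptions made for illustration only; the abstract does not specify the actual weights, WordNet measures, or preprocessing used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def combine(emb_score, wn_score, alpha=0.5):
    """Linear combination of two scores; alpha is a hypothetical weight."""
    return alpha * emb_score + (1.0 - alpha) * wn_score


def average_ranks(values):
    """1-based ranks in ascending order; ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

In use, one would score each word pair in a benchmark with `combine(cosine(vec_a, vec_b), wn_score)` and pass the list of predicted scores together with the human ratings to `spearman`. A tuned `alpha` would normally be chosen on held-out data rather than fixed at 0.5.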
