Measuring Word Semantic Similarity Based on Transferred Vectors

Semantic similarity between words has become a popular research problem in the field of natural language processing (NLP). Word embeddings have recently demonstrated progress in measuring word similarity. However, being grounded in the distributional hypothesis, basic embedding methods have inherent drawbacks. One limitation is that word embeddings are usually learned by predicting a target word from its local context, so only limited information is captured. In this paper, we propose a novel transferred-vectors approach to computing word semantic similarity. Transferred vectors are obtained via a principled combination of the source word and its nearest neighbors at the semantic level. We conduct experiments on popular English and Chinese benchmarks for measuring word similarity. The experimental results demonstrate that our method outperforms the previous state of the art by a large margin.
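The idea of a transferred vector described above can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: the toy embedding table, the neighbor count `k`, the mixing weight `alpha`, and the simple weighted-average combination are all assumptions made for demonstration.

```python
import numpy as np

# Toy embedding table; in practice these vectors would come from a
# pre-trained model (e.g. word2vec). Values are illustrative only.
embeddings = {
    "car":    np.array([0.9, 0.1, 0.0]),
    "auto":   np.array([0.85, 0.15, 0.05]),
    "truck":  np.array([0.8, 0.2, 0.1]),
    "banana": np.array([0.0, 0.9, 0.4]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def transferred_vector(word, k=2, alpha=0.5):
    """Combine a word's own vector with the mean of its k nearest
    neighbors (by cosine similarity). alpha weights the source word;
    the exact combination used in the paper may differ."""
    source = embeddings[word]
    neighbors = sorted(
        (w for w in embeddings if w != word),
        key=lambda w: cosine(source, embeddings[w]),
        reverse=True,
    )[:k]
    neighbor_mean = np.mean([embeddings[w] for w in neighbors], axis=0)
    return alpha * source + (1 - alpha) * neighbor_mean

def similarity(w1, w2):
    """Word similarity computed between transferred vectors
    rather than the raw embeddings."""
    return cosine(transferred_vector(w1), transferred_vector(w2))
```

With this sketch, `similarity("car", "auto")` exceeds `similarity("car", "banana")`, since each word's transferred vector is pulled toward its semantic neighborhood before the comparison is made.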
