New Word Pair Level Embeddings to Improve Word Pair Similarity

We present a novel approach for computing the similarity of English word pairs. Whereas many previous approaches take the cosine similarity of two individually computed word embeddings, we compute a single embedding for the word pair that is suited to similarity computation. These pair embeddings are then used to train a machine learning model. Test results on the MEN and WordSim-353 datasets show that, for the task of word pair similarity, computing word pair embeddings performs better than computing word embeddings alone.
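To make the contrast concrete, the following is a minimal sketch of the two strategies the abstract describes: a cosine baseline over individual word vectors, and a model trained on one embedding per word pair. The abstract does not say how the pair embedding is constructed or which model is trained, so `pair_embedding` (element-wise product concatenated with absolute difference), the ridge regressor, the random toy vectors, and the similarity labels are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def cosine_similarity(u, v):
    """Baseline: cosine of two independently computed word vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def pair_embedding(u, v):
    """Hypothetical pair-level embedding (an assumption, not from the paper).

    Both halves are symmetric in (u, v), so the order of the words
    in the pair does not change the resulting vector.
    """
    return np.concatenate([u * v, np.abs(u - v)])

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the bias is also regularized for simplicity."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict(w, X):
    """Apply the fitted weights to a matrix of pair embeddings."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

# Toy data: random vectors stand in for pretrained word embeddings,
# and the similarity labels are made up for illustration.
rng = np.random.default_rng(0)
dim = 8
vocab = {w: rng.normal(size=dim)
         for w in ["car", "automobile", "cat", "dog", "tree"]}
train_pairs = [
    ("car", "automobile", 0.9),
    ("cat", "dog", 0.7),
    ("car", "tree", 0.1),
    ("dog", "tree", 0.2),
]

X = np.stack([pair_embedding(vocab[a], vocab[b]) for a, b, _ in train_pairs])
y = np.array([s for _, _, s in train_pairs])
w = fit_ridge(X, y)

# Score an unseen pair with both strategies.
u, v = vocab["cat"], vocab["automobile"]
baseline_score = cosine_similarity(u, v)
learned_score = float(predict(w, pair_embedding(u, v)[None, :])[0])
```

In a real evaluation, the random vectors would be replaced by pretrained embeddings, and the model would be trained and tested on human-rated pairs such as those in MEN or WordSim-353, with predictions compared to the ratings by rank correlation.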
