Semantic Network Based Approach to Compute Term Semantic Similarity

Measuring semantic similarity between two terms is essential for a variety of text analytics and text understanding applications. This paper presents an approach for measuring the semantic similarity between terms. Previous work on semantic similarity has focused either on the structure of the semantic network between terms or only on the Information Content (IC) of terms. However, existing approaches are limited by the size of the knowledge base and corpus they rely on. We propose an efficient and effective approach for computing semantic similarity using a large-scale semantic network. The approach is based on Probase, a large graph of concepts whose knowledge is harnessed from billions of web pages and years' worth of search logs. Through experiments on well-known word similarity datasets, we show that our approach is much more efficient than all competing algorithms.
