Computing semantic similarity from bilingual dictionaries

In this paper, we address the task of computing mono- and bilingual semantic similarity. We introduce a method that derives a measure of semantic relatedness from the information implicitly contained in bilingual dictionaries. Our experiments show that the method performs comparably to approaches based on hierarchical knowledge bases and corpus statistics. The approach has two advantages: it relies solely on easily available bilingual dictionaries, and it computes mono- and bilingual semantic relatedness at the same time.
