Duluth: Measuring Cross-Level Semantic Similarity with First and Second Order Dictionary Overlaps

This paper describes the Duluth systems that participated in the Cross-Level Semantic Similarity task of SemEval-2014. All three systems were unsupervised, relied on a dictionary melded together from various sources, and used first-order (Lesk) and second-order (Vector) overlaps to measure similarity. The first-order overlaps fared well according to Spearman’s correlation (top 5) but less well according to Pearson’s. Most participating systems performed at comparable levels on both measures, which suggests that the Duluth approach is potentially unique among them.
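The distinction between first-order and second-order overlaps can be illustrated with a minimal sketch. This is not the Duluth implementation: the toy dictionary, the stopword list, and all function names below are assumptions made for illustration, and the first-order score is a plain shared-word count rather than any particular Lesk variant. A first-order overlap compares the words of two texts directly, while a second-order overlap replaces each word with the words of its dictionary gloss and compares the resulting vectors.

```python
import math
from collections import Counter

# Hypothetical toy dictionary (word -> gloss); the paper's dictionary is
# assembled from several real sources and is far larger.
TOY_DICTIONARY = {
    "bank": "financial institution that accepts deposits and lends money",
    "loan": "money lent by a financial institution at interest",
    "river": "large natural stream of water flowing to the sea",
}

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "and", "at", "by", "of", "that", "the", "to"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def first_order_overlap(text_a, text_b):
    """Count content words shared by the two texts (simplified Lesk-style score)."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Counter intersection keeps the minimum count of each shared word.
    return sum((counts_a & counts_b).values())


def second_order_vector(text, dictionary):
    """Sum the gloss word counts of every word in the text that has an entry."""
    vector = Counter()
    for word in tokenize(text):
        if word in dictionary:
            vector.update(tokenize(dictionary[word]))
    return vector


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def second_order_similarity(text_a, text_b, dictionary):
    """Compare two texts through the glosses of their words."""
    return cosine(
        second_order_vector(text_a, dictionary),
        second_order_vector(text_b, dictionary),
    )
```

In this toy setting, "bank" and "loan" share no words directly, so their first-order overlap is zero, yet their glosses share several words, so their second-order similarity is positive. "bank" and "river" share nothing at either level.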
