Multi-word complex concept retrieval via lexical semantic similarity

This paper first presents a simple computational measure of universal object similarity based on classical feature-based similarity models. The model is implemented using semantic network representations (e.g. the WordNet taxonomy) and corpus statistics. It is then extended and applied to a higher-level, practical information retrieval task: retrieving multi-word complex concepts. The extension compares, pair-wise, all decomposed sub-concepts or terms in a query against those in the texts, and tries different schemes that combine averaging and maximization of the pair-wise similarities. A series of experiments compares the approach with classic statistical methods, and the results support it.
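The pair-wise combination step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's actual model: the `FEATURES` table is a hypothetical stand-in for features derived from a taxonomy and corpus statistics, `word_similarity` uses a plain Jaccard ratio over feature sets (one simple instance of a feature-based measure), and the two combination functions are only two of the possible ways to mix maximization and averaging.

```python
from statistics import mean

# Hypothetical feature sets; a real system would derive these from a
# taxonomy such as WordNet plus corpus statistics.
FEATURES = {
    "car": {"vehicle", "wheels", "engine"},
    "truck": {"vehicle", "wheels", "engine", "cargo"},
    "bicycle": {"vehicle", "wheels"},
    "insurance": {"contract", "finance"},
    "policy": {"contract", "finance", "document"},
}


def word_similarity(a, b):
    """Jaccard ratio of shared to total features (assumed measure)."""
    fa, fb = FEATURES.get(a, set()), FEATURES.get(b, set())
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0


def max_then_average(query_terms, text_terms):
    """Best match in the text for each query term, averaged over the query."""
    if not query_terms or not text_terms:
        return 0.0
    return mean(
        max(word_similarity(q, t) for t in text_terms) for q in query_terms
    )


def average_of_all_pairs(query_terms, text_terms):
    """Plain average over every query-term / text-term pair."""
    if not query_terms or not text_terms:
        return 0.0
    return mean(
        word_similarity(q, t) for q in query_terms for t in text_terms
    )


if __name__ == "__main__":
    query = ["car", "insurance"]
    text = ["truck", "policy"]
    print(max_then_average(query, text))
    print(average_of_all_pairs(query, text))
```

In this toy case the max-then-average scheme scores the text higher than the all-pairs average, because unrelated pairs such as "car" and "policy" no longer dilute the score. Which scheme works better in practice is an empirical question of the kind the experiments address.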
