A Study on the Semantic Relatedness of Query and Document Terms in Information Retrieval

The use of lexical semantic knowledge in information retrieval has long been an active field of study. Collaborative knowledge bases such as Wikipedia and Wiktionary, which have only recently been applied in computational methods, offer new possibilities for enhancing information retrieval. To find the most beneficial way to employ these resources, we analyze the lexical semantic relations that hold between query and document terms and compare how these relations are represented by a measure of semantic relatedness. We explore the potential of different indicators of document relevance that are based on semantic relatedness, and we compare the characteristics and performance of the knowledge bases Wikipedia, Wiktionary, and WordNet.
