THESAURUS AND QUERY EXPANSION

The explosive growth of the World Wide Web is making it difficult for users to locate information relevant to their interests. Although existing search engines work well to a certain extent, they still face two problems. The first is word mismatch, which arises because the majority of information retrieval systems compare query and document terms at the lexical level rather than the semantic level. The second is short queries: the average length of user queries is less than two words. Short queries and the mismatch between the terms in user queries and those in documents strongly affect the retrieval of relevant documents. Query expansion has long been suggested as a technique for increasing the effectiveness of information retrieval. It is the process of adding terms or phrases to the original query to improve retrieval performance. The central problem of query expansion is selecting the expansion terms with which the user's original query is expanded. A thesaurus helps to solve this problem. Thesauri have frequently been incorporated into information retrieval systems to identify synonymous expressions and linguistic entities that are semantically similar, and they have been widely used in many applications, including information retrieval and natural language processing.
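
As a rough illustration of thesaurus-based query expansion, the following minimal Python sketch adds related terms from a lookup table to the original query terms. The thesaurus contents, the function name expand_query, and the max_per_term cap are all assumptions made for this example; they are not taken from the paper, and a real system would also need term weighting and word sense disambiguation.

```python
from typing import Dict, List

# Hypothetical toy thesaurus (assumed data for illustration only):
# each key maps to a list of semantically related terms.
TOY_THESAURUS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
    "film": ["movie"],
}


def expand_query(
    query: str,
    thesaurus: Dict[str, List[str]],
    max_per_term: int = 2,
) -> List[str]:
    """Return the original query terms followed by thesaurus expansions."""
    terms = query.lower().split()
    expanded = list(terms)  # keep the user's own terms first
    for term in terms:
        # Take at most max_per_term related terms for each query term.
        for candidate in thesaurus.get(term, [])[:max_per_term]:
            if candidate not in expanded:  # avoid duplicate terms
                expanded.append(candidate)
    return expanded


print(expand_query("car rental", TOY_THESAURUS))
# prints ['car', 'rental', 'automobile', 'vehicle']
```

Terms with no thesaurus entry, such as "rental" here, pass through unchanged, so a two-word query still gains additional matching terms for documents that use "automobile" or "vehicle" instead of "car".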