Query expansion based on relevance feedback and latent semantic analysis

Web search engines are among the most popular tools on the Internet and are widely used by both expert and novice users. Constructing a query that adequately specifies the user's information need is an important concern for web users. Query expansion is one way to reduce this concern and increase user satisfaction. In this paper, a new method of query expansion is introduced. The method combines relevance feedback with latent semantic analysis: it finds terms related to the topics of the user's original query, based on the relevant documents the user selects in the relevance feedback step. The method is evaluated and compared with Rocchio relevance feedback. The results indicate that the method represents the user's information need better and significantly increases user satisfaction.
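The abstract does not give the details of the algorithm, so the following is only a minimal sketch of how latent semantic analysis over user-selected relevant documents could suggest expansion terms. It is not the paper's method. The function name `expand_query`, the parameters `k` and `n_new`, the whitespace tokenization, the raw term counts, and the centroid representation of the query are all assumptions made for illustration.

```python
import numpy as np


def expand_query(query_terms, relevant_docs, k=2, n_new=3):
    """Suggest expansion terms (hypothetical helper, not the paper's algorithm)."""
    # Build a term-by-document count matrix from the relevant documents.
    tokenized = [doc.lower().split() for doc in relevant_docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(tokenized)))
    for j, doc in enumerate(tokenized):
        for t in doc:
            A[index[t], j] += 1.0

    # Truncated SVD: rows of U_k * S_k are term vectors in the latent space.
    U, S, _ = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(S))
    term_vecs = U[:, :k] * S[:k]

    # Assumption: represent the query as the centroid of its known term vectors.
    known = [index[t] for t in query_terms if t in index]
    if not known:
        return []
    q = term_vecs[known].mean(axis=0)

    # Cosine similarity between the query vector and every term vector.
    norms = np.linalg.norm(term_vecs, axis=1) * np.linalg.norm(q)
    sims = term_vecs @ q / np.where(norms == 0, 1.0, norms)

    # Keep the most similar terms that are not already in the query.
    ranked = [vocab[i] for i in np.argsort(-sims) if vocab[i] not in query_terms]
    return ranked[:n_new]


if __name__ == "__main__":
    docs = [
        "jaguar car engine speed",
        "jaguar car dealer price",
        "car engine repair",
    ]
    print(expand_query(["jaguar"], docs))
```

In this sketch the latent space is built only from the documents marked relevant, so the suggested terms reflect the topics of those documents rather than of the whole collection. A realistic implementation would likely add term weighting such as tf-idf and stop-word removal, which are omitted here.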
