An Effective Information Retrieval for Ambiguous Query

A search engine returns thousands of web pages for a single user query, and most of them are not relevant. Effective information retrieval from the expanding web is therefore a challenging task, particularly when the query is ambiguous. The central question is how to retrieve the relevant pages for an ambiguous query. We propose an approach that forms community vectors, one per candidate sense of the query, using the association concept from data mining, the vector space model, and the freedictionary. We then form clusters by computing the similarity between these community vectors and the document vectors built from the web pages returned by the search engine. We implement the algorithm with the Gensim package because of its simplicity and robustness. Our analysis indicates that this approach is an effective way to cluster the results of an ambiguous query.
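The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain Python with raw term counts and cosine similarity instead of Gensim, and the query "jaguar", the community term lists, the sample documents, and the function names `to_vector`, `cosine`, and `assign_clusters` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def to_vector(tokens):
    """Represent a bag of tokens as a sparse term-count vector."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_clusters(communities, documents):
    """Place each document in the cluster of its most similar community vector.

    Documents that share no terms with any community go to "unassigned".
    """
    clusters = {name: [] for name in communities}
    clusters["unassigned"] = []
    for doc_id, text in documents.items():
        doc_vec = to_vector(text.lower().split())
        best_name, best_score = "unassigned", 0.0
        for name, terms in communities.items():
            score = cosine(to_vector(terms), doc_vec)
            if score > best_score:
                best_name, best_score = name, score
        clusters[best_name].append(doc_id)
    return clusters


# Hypothetical community vectors for two senses of the ambiguous query "jaguar".
communities = {
    "animal": ["jaguar", "cat", "jungle", "prey", "species"],
    "car": ["jaguar", "car", "engine", "luxury", "sedan"],
}

# Hypothetical snippets standing in for pages returned by a search engine.
documents = {
    "d1": "the jaguar is a big cat that hunts prey in the jungle",
    "d2": "jaguar unveiled a luxury sedan with a new engine",
    "d3": "weather forecast for the weekend",
}

clusters = assign_clusters(communities, documents)
for name, doc_ids in clusters.items():
    print(name, doc_ids)
```

In this sketch each page joins the sense whose community vector it resembles most, and a page with no overlapping terms is left unassigned. A fuller version would likely apply stop-word removal and TF-IDF weighting, which the abstract does not specify.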
