The Web has grown into a huge and diverse information resource. To search such a rich source of information, one has to be very precise in choosing query keywords in order to retrieve the relevant documents. Most queries issued to search engines are short and have ambiguous context. One way to produce effective queries is automatic query expansion. Prior work in this field uses local and global techniques. Global techniques examine word occurrences and relationships in the corpus as a whole and use this information to expand a particular query. Local context analysis examines concept occurrences and relationships in the top-ranked documents retrieved by the original input query to expand that query. Researchers have also used search engine query logs to expand input queries, using the clicked documents associated with any of the input query's terms in the logged query sessions. In this paper, a new local analysis technique is proposed that uses the information need of query sessions, modeled using Information Scent and the content of clicked documents, to select the clicked documents for query expansion. Information scent is the subjective sense of value and cost of accessing a page based on perceptual cues with respect to the information need of the user. The input query issued in a particular domain is used to select the set of documents associated with the information need of the query sessions in the same domain; these documents serve as a local corpus that provides a set of related terms to be added to the input query. The resulting expanded query is used to retrieve the relevant documents from the same retrieval system.
This approach is unique in that its local corpus contains only those documents that belong to the information need associated with the domain in which the input query is issued, identified using Information Scent and the content of clicked pages in the query sessions. Expanding the initial input query with a set of related terms then directs the search in a fruitful direction. An experimental study of the proposed approach was carried out on a data set extracted from the Web history of the "Google" search engine. The improvement in information retrieval precision, together with the low computational complexity of online processing of input queries, confirms the effectiveness of the proposed approach.
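The general idea of expanding a query from a local corpus of clicked documents can be illustrated with a minimal sketch. It is not the method of this paper: the abstract does not specify how Information Scent is computed or how expansion terms are ranked, so the sketch assumes each clicked document already carries a scent weight and ranks candidate terms by scent-weighted, length-normalized term frequency. The function `expand_query`, its parameters, the stopword list, and the sample data are all hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on", "with"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def expand_query(query, clicked_docs, scent_weights, num_terms=2):
    """Return the query terms followed by up to num_terms expansion terms.

    clicked_docs  : texts of the clicked documents forming the local corpus
    scent_weights : one non-negative weight per document, standing in for
                    its Information Scent (assumed to be computed elsewhere)
    """
    query_terms = tokenize(query)
    scores = Counter()
    for text, weight in zip(clicked_docs, scent_weights):
        tokens = tokenize(text)
        if not tokens:
            continue
        for term, count in Counter(tokens).items():
            if term in query_terms:
                continue  # never re-add a term already in the query
            # Scent-weighted, length-normalized term frequency.
            scores[term] += weight * count / len(tokens)
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return query_terms + [term for term, _ in ranked[:num_terms]]


if __name__ == "__main__":
    docs = [
        "jaguar car dealer sells jaguar car models",
        "jaguar animal habitat in the rainforest",
    ]
    # Hypothetical scent weights: the first document matches the session's need.
    print(expand_query("jaguar", docs, [0.9, 0.1]))
```

In this toy setting the ambiguous query "jaguar" is expanded with terms from the document that carries the higher weight, which mirrors how the selected clicked documents are meant to steer the search toward the domain of the query sessions.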