Probabilistic query expansion using query logs

Query expansion has long been suggested as an effective way to address the short-query and word-mismatch problems. A number of query expansion methods have been proposed in traditional information retrieval. However, these methods do not take into account the specific characteristics of web searching, in particular the availability of large amounts of user interaction information recorded in web query logs. In this study, we propose a new method for query expansion based on query logs. The central idea is to extract probabilistic correlations between query terms and document terms by analyzing query logs. These correlations are then used to select high-quality expansion terms for new queries. The experimental results show that our log-based probabilistic query expansion method can greatly improve search performance and has several advantages over existing methods.
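
To make the central idea concrete, the following is a minimal sketch of one way such correlations could be estimated and used. It is an illustration under stated assumptions, not the exact formulation evaluated in the study. It assumes the log is reduced to sessions of the form (query terms, clicked document id), that P(document term | document) is a plain relative term frequency, and that evidence from several query terms is combined by summing log(1 + p). The names `term_correlations` and `expand` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def term_correlations(sessions, docs):
    """Estimate P(doc_term | query_term) from click sessions.

    sessions: iterable of (query_terms, clicked_doc_id) pairs (assumed log format).
    docs: dict mapping doc_id to a list of that document's terms.
    """
    q_count = Counter()                 # number of sessions containing each query term
    q_doc_count = defaultdict(Counter)  # query term -> clicked doc -> session count
    for query_terms, doc_id in sessions:
        for q in set(query_terms):
            q_count[q] += 1
            q_doc_count[q][doc_id] += 1

    # P(doc_term | doc) as relative term frequency (an assumption of this sketch).
    p_term_given_doc = {}
    for doc_id, terms in docs.items():
        tf = Counter(terms)
        total = sum(tf.values())
        if total:
            p_term_given_doc[doc_id] = {t: c / total for t, c in tf.items()}

    # P(doc_term | q) = sum over clicked docs D of P(doc_term | D) * P(D | q).
    corr = defaultdict(dict)
    for q, clicked in q_doc_count.items():
        for doc_id, n in clicked.items():
            p_doc_given_q = n / q_count[q]
            for t, p in p_term_given_doc.get(doc_id, {}).items():
                corr[q][t] = corr[q].get(t, 0.0) + p * p_doc_given_q
    return corr


def expand(query_terms, corr, k=3):
    """Return the k document terms most strongly correlated with the whole query."""
    scores = defaultdict(float)
    for q in set(query_terms):
        for t, p in corr.get(q, {}).items():
            if t not in query_terms:
                # Summing log(1 + p) rewards terms supported by several query terms.
                scores[t] += math.log1p(p)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]
```

Calling `term_correlations` once over the log and then `expand` per incoming query keeps the expensive pass offline; the returned terms would be appended to the original query before retrieval.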
