Query Expansion by Mining User Logs

Queries submitted to Web search engines are usually short, and so they rarely provide enough information to select relevant documents effectively. Previous research has proposed query expansion to deal with this problem. However, expansion terms are usually selected on the basis of term co-occurrences within documents. In this study, we propose a new method for query expansion based on the user interactions recorded in user logs. The central idea is to extract correlations between query terms and document terms by analyzing these logs. The correlations are then used to select high-quality expansion terms for new queries. Unlike previous query expansion methods, ours takes advantage of the user judgments implied in user logs. The experimental results show that the log-based query expansion method produces much better results than both the classical search method and the other query expansion methods.
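
The central idea can be illustrated with a minimal sketch. The abstract does not specify how the correlations are estimated, so everything below is an assumption made for illustration: the log is reduced to (query terms, clicked document) pairs, the correlation between a query term and a document term is estimated as the click-weighted relative frequency of the document term in the clicked documents, and the scores for a multi-term query are simply summed. The names `term_correlations` and `expand`, and the toy data, are hypothetical.

```python
from collections import Counter, defaultdict

def term_correlations(sessions, documents):
    """Estimate P(doc_term | query_term) from click sessions (illustrative estimator).

    sessions:  list of (query_terms, clicked_doc_id) pairs taken from a log
    documents: dict mapping doc_id -> list of terms in that document
    """
    # clicks[q][d] = number of sessions in which a query containing q led to a click on d
    clicks = defaultdict(Counter)
    for query_terms, doc_id in sessions:
        for q in set(query_terms):
            clicks[q][doc_id] += 1

    corr = {}
    for q, doc_counts in clicks.items():
        total_clicks = sum(doc_counts.values())
        scores = defaultdict(float)
        for doc_id, n in doc_counts.items():
            p_doc_given_q = n / total_clicks          # share of q's clicks that went to this doc
            tf = Counter(documents[doc_id])
            doc_len = sum(tf.values())
            for term, count in tf.items():
                # relative term frequency stands in for P(term | doc): an assumption
                scores[term] += p_doc_given_q * count / doc_len
        corr[q] = dict(scores)
    return corr

def expand(query_terms, corr, k=3):
    """Return the k document terms most correlated with the query (summed scores)."""
    scores = Counter()
    for q in query_terms:
        for term, p in corr.get(q, {}).items():
            if term not in query_terms:               # do not re-add original query terms
                scores[term] += p
    return [term for term, _ in scores.most_common(k)]

# Toy data, invented for this example only.
documents = {
    "d1": ["jaguar", "car", "dealer"],
    "d2": ["jaguar", "cat", "habitat"],
    "d3": ["car", "engine"],
}
sessions = [
    (["jaguar", "price"], "d1"),
    (["jaguar", "price"], "d1"),
    (["jaguar"], "d2"),
    (["price"], "d3"),
]

corr = term_correlations(sessions, documents)
expansion = expand(["jaguar", "price"], corr, k=2)
print(expansion)
```

Because the clicks act as implicit relevance judgments, terms from documents that users actually chose for a query term receive more weight than terms that merely co-occur with it somewhere in the collection. A real system would need smoothing, stop-word handling, and a more careful way of combining evidence across query terms.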
