Paper Information - Clustering Using a Hybrid Query Similarity Measure

Clustering Using a Hybrid Query Similarity Measure

Query clustering is a task that groups similar queries automatically without using predetermined class descriptions. Such clusters can be used to discover the common interests of online information seekers and to exploit their collective search experience for the benefit of others. Since similarity is fundamental to the definition of a cluster, measures of similarity between two queries are essential to the query clustering procedure. This paper introduces a hybrid query similarity measure that uses both query terms and the results returned to queries. Experiments show that the hybrid approach can generate query clusters of better overall quality than existing similarity measures.

Key-Words: Information retrieval, query mining, query clustering, similarity measures.

Dion Hoe-Lian Goh | Lin Fu | Schubert Shou-Boon Foo

[1] Craig Silverstein,et al. Analysis of a Very Large Altavista Query Log , SRC Technical Note #1998-14 , 1998 .

[2] Larry Fitzpatrick,et al. Automatic feedback using past queries: social searching? , 1997, SIGIR '97.

[3] Carolyn J. Crouch,et al. The automatic generation of extended queries , 1989, SIGIR '90.

[4] Michael McGill,et al. Introduction to Modern Information Retrieval , 1983 .

[5] Osmar R. Zaïane,et al. Finding Similar Queries to Satisfy Searches Based on Query Traces , 2002, OOIS Workshops.

[6] Natalie S. Glance,et al. Community search assistant , 2001, IUI '01.

[7] Vijay V. Raghavan,et al. On the reuse of past optimal queries , 1995, SIGIR '95.

[8] Ji-Rong Wen,et al. Query clustering using user logs , 2002, TOIS.