Clustering Using a Hybrid Query Similarity Measure

Query clustering is a task that groups similar queries automatically without using predetermined class descriptions. Such clusters can be used to discover the common interests of online information seekers and to exploit their collective search experience for the benefit of others. Since similarity is fundamental to the definition of a cluster, a measure of similarity between two queries is essential to the query clustering procedure. This paper introduces a hybrid query similarity measure that uses both query terms and the results returned to queries. Experiments show that the hybrid approach can generate query clusters of better overall quality than those produced with existing similarity measures.

Key-Words: Information retrieval, query mining, query clustering, similarity measures.
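
As a rough illustration of what a hybrid measure of this kind might look like, the sketch below combines a term-based similarity with a result-based similarity through a weighted sum. This is a minimal sketch under stated assumptions, not the formulation used in this paper: the choice of Jaccard overlap for both components, the linear combination, the weight `alpha`, and the function names `jaccard` and `hybrid_similarity` are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # assumption: two empty sets are treated as dissimilar
    return len(a & b) / len(a | b)


def hybrid_similarity(terms_p, terms_q, results_p, results_q, alpha=0.5):
    """Weighted sum of term overlap and result overlap (illustrative only).

    terms_*   : query terms of each query
    results_* : identifiers (e.g. URLs) of results returned to each query
    alpha     : hypothetical weight in [0, 1] given to the term-based part
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    term_sim = jaccard(terms_p, terms_q)
    result_sim = jaccard(results_p, results_q)
    return alpha * term_sim + (1.0 - alpha) * result_sim


# Example: two queries sharing one of three distinct terms
# and two of four distinct results.
score = hybrid_similarity(
    ["jaguar", "car"], ["jaguar", "price"],
    ["u1", "u2", "u3"], ["u2", "u3", "u4"],
)
```

With `alpha = 1.0` the sketch reduces to a purely term-based measure, and with `alpha = 0.0` to a purely result-based one, so the weight controls how much each source of evidence contributes.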