DQR: a probabilistic approach to diversified query recommendation

Web search queries issued by casual users are often short and have limited expressiveness. Query recommendation is a popular technique employed by search engines to help users refine their queries. Traditional similarity-based methods, however, often produce redundant and monotonous recommendations. We identify five basic requirements of a query recommendation system and focus in particular on two of them: redundancy-free and diversified recommendations. We propose the DQR framework, which mines a search log to achieve two goals: (1) it clusters search log queries to extract query concepts, from which recommended queries are selected; (2) it employs a probabilistic model and a greedy heuristic algorithm to diversify the recommendations. Through a comprehensive user study, we compare DQR against five other recommendation methods. Our experiments show that DQR outperforms the other methods in terms of the relevancy, diversity, and ranking performance of its recommendations.
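To give a feel for what greedy diversification can look like, the following is a minimal sketch and not the DQR algorithm itself. The abstract does not specify DQR's probabilistic model, so this sketch swaps in an MMR-style objective: each step picks the candidate with the best trade-off between its relevance and its similarity to what has already been selected. The function names `jaccard` and `greedy_diversify`, the `trade_off` parameter, the word-overlap similarity, and the relevance scores are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two queries (illustrative assumption)."""
    wa, wb = set(a.split()), set(b.split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def greedy_diversify(
    relevance: Dict[str, float],
    similarity: Callable[[str, str], float],
    k: int,
    trade_off: float = 0.5,
) -> List[str]:
    """Greedily pick up to k candidates, MMR-style.

    Each round selects the candidate maximizing
        trade_off * relevance - (1 - trade_off) * max similarity to the picks so far.
    This stands in for DQR's probabilistic objective, which is not given here.
    """
    selected: List[str] = []
    remaining = set(relevance)
    while remaining and len(selected) < k:

        def marginal(candidate: str) -> float:
            # Redundancy is the closest match among already selected items.
            redundancy = max(
                (similarity(candidate, s) for s in selected), default=0.0
            )
            return trade_off * relevance[candidate] - (1 - trade_off) * redundancy

        # Sort first so that ties are broken deterministically.
        best = max(sorted(remaining), key=marginal)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical relevance scores for recommendations to the query "jaguar".
scores = {"jaguar car": 0.9, "jaguar car price": 0.85, "jaguar animal": 0.6}
diverse = greedy_diversify(scores, jaccard, k=2, trade_off=0.5)
relevance_only = greedy_diversify(scores, jaccard, k=2, trade_off=1.0)
```

With `trade_off=1.0` the selection reduces to a plain relevance ranking, which returns two near-duplicate car queries. With `trade_off=0.5` the second pick shifts to the less similar "jaguar animal" query, which illustrates the redundancy problem the abstract attributes to similarity-based methods.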
