mNIR: Diversifying Search Results Based on a Mixture of Novelty, Intention and Relevance

Current search engines do not explicitly consider the different meanings and usages of a user query when they rank search results. As a result, they tend to retrieve results that cover only the most popular meanings or usages of the query. Users who want results covering a rare meaning or usage of the query, or results covering all of its meanings and usages, may therefore have to go through a large number of results to find the desired ones. Another problem with current search engines is that they do not adequately take the user's intention into consideration. In this paper, we introduce a novel result ranking algorithm (mNIR) that explicitly takes result novelty, the user-intention-based distribution, and result relevance into consideration and mixes them to achieve better result ranking. We analyze how placing different emphasis on each of these three aspects affects the overall ranking of the results. Our approach builds on our previous method for identifying and ranking the possible categories of any user query based on the meanings and usages of the terms and phrases within the query. These categories are also used to generate category queries for retrieving results that match the different meanings and usages of the original user query. Our experimental results show that the proposed algorithm can outperform state-of-the-art diversification approaches.
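To make the idea of mixing the three aspects concrete, the following is a minimal Python sketch of a greedy re-ranker. It is an illustration only and is not the mNIR scoring function, which this abstract does not specify. The function name `mixture_rerank`, the weights `w_nov`, `w_int` and `w_rel`, the linear combination, and the novelty term (the reciprocal of one plus the number of already selected results in the same category) are all assumptions made for this example.

```python
from collections import Counter


def mixture_rerank(results, intent_prob, w_nov=1/3, w_int=1/3, w_rel=1/3):
    """Greedily re-rank results by a weighted mix of three signals.

    results:     list of (doc_id, category, relevance), relevance in [0, 1]
    intent_prob: dict mapping category -> assumed probability of that intent
    The linear mixture and the novelty formula are hypothetical choices.
    """
    remaining = list(results)
    ranked = []
    seen = Counter()  # number of selected results per category so far

    def score(item):
        _, category, relevance = item
        # Novelty decays as more results of the same category are selected.
        novelty = 1.0 / (1 + seen[category])
        intention = intent_prob.get(category, 0.0)
        return w_nov * novelty + w_int * intention + w_rel * relevance

    while remaining:
        best = max(remaining, key=score)  # ties go to the earliest item
        remaining.remove(best)
        ranked.append(best[0])
        seen[best[1]] += 1
    return ranked


# Hypothetical results for an ambiguous query such as "jaguar".
results = [
    ("a1", "animal", 0.9),
    ("a2", "animal", 0.8),
    ("c1", "car", 0.7),
    ("o1", "os", 0.3),
]
intent_prob = {"animal": 0.6, "car": 0.3, "os": 0.1}

print(mixture_rerank(results, intent_prob))                 # equal emphasis
print(mixture_rerank(results, intent_prob, 0.0, 0.0, 1.0))  # relevance only
print(mixture_rerank(results, intent_prob, 1.0, 0.0, 0.0))  # novelty only
```

Under these assumptions, shifting weight toward novelty pulls results from rarer categories ahead of a second result from the dominant category, while shifting weight toward relevance reproduces the original relevance order.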
