Multiple intents re-ranking

One of the most fundamental problems in web search is how to re-rank result web pages based on user logs. Most traditional re-ranking models assume that each query has a single intent; that is, they assume that all users who formulate the same query have similar preferences over the result web pages. This is clearly not true for a large portion of queries, since different users may have different preferences over the results. Accordingly, a more accurate model should assume that queries have multiple intents. In this paper, we introduce the multiple intents re-ranking problem. This problem captures scenarios in which a user issues a query and there is no information about the user's real search intent. In such cases, one would like to re-rank the search results in a way that minimizes the effort all users spend finding their relevant web pages.

More formally, the setting of this problem consists of various types of users, each of which is interested in some subset of the search results. Moreover, each user type has a non-negative profile vector. Consider some ordering of the search results. This ordering assigns a position to each search result, and induces a position vector of the results relevant to each user type. The overhead of a user type is the dot product of its profile vector and its induced position vector. The goal is to order the search results so as to minimize the average overhead of the users.

Our main result is an O(log r)-approximation algorithm for the problem, where r is the maximum number of search results that are relevant to any user type. The algorithm is based on a new technique, which we call harmonic interpolation. In addition, we consider two important special cases. The first case is when the profile vector of each user type is non-increasing. This case is a generalization of the well-known min-sum set cover problem. We extend the techniques of Feige, Lovász and Tetali (Algorithmica '04), and present an algorithm achieving a 4-approximation. The second case is when the profile vector of each user type is non-decreasing. This case generalizes the minimum latency set cover problem, introduced by Hassin and Levin (ESA '05). We devise an LP-based algorithm that attains a 2-approximation for it.
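
To make the objective concrete, the following is a minimal Python sketch of the overhead definition given in the abstract. It is an illustration only, not the paper's algorithm: the function names (`overhead`, `average_overhead`, `brute_force_best`), the toy instance, and the choice to weight all user types equally in the average are assumptions made for this example. The induced position vector is taken to be the 1-based positions of a user type's relevant results, listed in the order they appear in the ranking. The brute-force search enumerates every ordering, so it is only usable on tiny instances.

```python
from itertools import permutations

def overhead(ordering, relevant, profile):
    """Dot product of a user type's profile vector with its induced position vector."""
    # 1-based positions of the relevant results, in the order they appear.
    positions = sorted(ordering.index(r) + 1 for r in relevant)
    return sum(w * p for w, p in zip(profile, positions))

def average_overhead(ordering, user_types):
    """Average overhead over user types (assumption: all types weighted equally)."""
    total = sum(overhead(ordering, rel, prof) for rel, prof in user_types)
    return total / len(user_types)

def brute_force_best(results, user_types):
    """Exhaustive search over all orderings; exponential, for toy instances only."""
    return min(permutations(results),
               key=lambda o: average_overhead(o, user_types))

# Hypothetical toy instance: three results and three user types.
results = ["a", "b", "c"]
user_types = [
    ({"a", "b"}, [1, 0]),  # non-increasing profile: only the first relevant hit counts
    ({"b", "c"}, [0, 1]),  # non-decreasing profile: only the last relevant hit counts
    ({"c"}, [1]),          # a single relevant result
]

best = brute_force_best(results, user_types)
print(best, average_overhead(best, user_types))
```

In this toy instance the first user type has a non-increasing profile and the second a non-decreasing one, which mirrors the two special cases discussed in the abstract. Any approximation algorithm for the problem would be judged against the optimum that this exhaustive search finds.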

[1] Benjamin Piwowarski et al. A user browsing model to predict search engine click data from past observations. SIGIR '08, 2008.

[2] Edith Cohen et al. Efficient sequences of trials. SODA '03, 2003.

[3] Susan T. Dumais et al. Improving Web Search Ranking by Incorporating User Behavior Information. SIGIR Forum, 2019.

[4] Mihir Bellare et al. On Chromatic Sums and Distributed Resource Allocation. Inf. Comput., 1998.

[5] Amanda Spink et al. Determining the informational, navigational, and transactional intent of Web queries. Inf. Process. Manag., 2008.

[6] Filip Radlinski et al. Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search. TOIS, 2007.

[7] Jennifer Widom et al. The Pipelined Set Cover Problem. ICDT, 2005.

[8] Yong Yu et al. Identifying ambiguous queries in web search. WWW '07, 2007.

[9] Refael Hassin et al. An Approximation Algorithm for the Minimum Latency Set Cover Problem. ESA, 2005.

[10] Haim Kaplan et al. Learning with attribute costs. STOC '05, 2005.

[11] Farhad Shahrokhi et al. The maximum concurrent flow problem. JACM, 1990.

[12] Maurice Queyranne et al. Structure of a simple scheduling polyhedron. Math. Program., 1993.

[13] Jochen Könemann et al. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. Proceedings 39th Annual Symposium on Foundations of Computer Science, 1998.

[14] László Lovász et al. Approximating Min Sum Set Cover. Algorithmica, 2004.

[15] Jennifer Widom et al. Adaptive ordering of pipelined stream filters. SIGMOD '04, 2004.

[16] Guy Kortsarz et al. A Matched Approximation Bound for the Sum of a Greedy Coloring. Inf. Process. Lett., 1999.

[17] Jochen Könemann et al. Faster and Simpler Algorithms for Multicommodity Flow and Other Fractional Packing Problems. SIAM J. Comput., 2007.

[18] Andreas S. Schulz. Scheduling to Minimize Total Weighted Completion Time: Performance Guarantees of LP-Based Heuristics and Lower Bounds. IPCO, 1996.