Network Flow for Collaborative Ranking

In query-based Web search, a significant percentage of user queries are underspecified, most likely because they are issued by naive users. Collaborative ranking helps the naive user by exploiting the collective expertise of other users. We present a novel algorithmic model, inspired by network flow theory, that constructs a search network from search engine logs to describe the relationships among the relevant entities in search: queries, documents, and users. This formal model permits a theoretical investigation of the nature of collaborative ranking in more concrete terms, and supports learning the dependence relations among the different entities. We implement and evaluate FlowRank, an algorithm derived from this model through an analysis of empirical usage patterns. We show its potential empirically in experiments involving real-world user relevance ratings and a random sample of 1,334 documents and 100 queries from a popular document search engine. We report definite improvements over two baseline ranking algorithms for approximately 47% of the queries.
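To make the idea of a search network concrete, the following is a minimal sketch and not the FlowRank algorithm itself, whose details are not given here. It assumes a hypothetical log of (user, query, document) click records, assumes edges run from queries to users and from users to documents with capacities equal to event counts, and scores each document by the maximum flow from the query node using the standard Edmonds-Karp method. The names `build_network`, `max_flow`, and `rank_documents`, as well as the sample log, are illustrative assumptions.

```python
from collections import defaultdict, deque


def build_network(log):
    """Build edge capacities from (user, query, doc) click records.

    Assumed structure (illustrative only): query -> user edges count how
    often the user issued the query; user -> doc edges count clicks.
    """
    cap = defaultdict(lambda: defaultdict(int))
    for user, query, doc in log:
        cap[("q", query)][("u", user)] += 1
        cap[("u", user)][("d", doc)] += 1
    return cap


def max_flow(cap, source, sink):
    """Edmonds-Karp: augment along shortest residual paths until none remain."""
    residual = defaultdict(lambda: defaultdict(int))
    for u in cap:
        for v, c in cap[u].items():
            residual[u][v] += c
    flow = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in list(residual[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount and update residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def rank_documents(log, query):
    """Rank documents by max flow from the query node (ties broken by name)."""
    cap = build_network(log)
    docs = {doc for _, _, doc in log}
    scores = {d: max_flow(cap, ("q", query), ("d", d)) for d in docs}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical click log, invented for illustration.
log = [
    ("alice", "jaguar", "cars.html"),
    ("alice", "jaguar", "cars.html"),
    ("bob", "jaguar", "cats.html"),
    ("bob", "jaguar speed", "cars.html"),
    ("carol", "jaguar", "cars.html"),
]
print(rank_documents(log, "jaguar"))
```

In this toy log, bob's click on cars.html followed a different query, yet it still contributes flow from "jaguar" through bob to that document, which is one simple way collective behavior could inform the ranking of an underspecified query.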
