Empirical Exploitation of Click Data for Task Specific Ranking

Web search increasingly needs task-specific rankings: rankings for specific query segments, such as long queries, time-sensitive queries, and navigational queries, or rankings for specific domains and content types, such as answers, blogs, and news. In the spirit of "divide-and-conquer", task-specific ranking may have advantages over generic ranking, since different tasks have their own features, data distributions, and feature-grade correlations. A critical problem for task-specific ranking is insufficient training data, which may be addressed by using data extracted from click logs. This paper empirically studies how to exploit click data appropriately to improve ranking function learning in task-specific ranking. The main contributions are 1) an exploration of the utility of two promising approaches to click pair extraction; 2) an analysis of the role played by the noise that inevitably appears in click data extraction; 3) an appropriate strategy for combining training data and click data; and 4) a comparison of click data that are consistent and inconsistent with the baseline function.
