Modeling dwell time to predict click-level satisfaction

Clicks on search results are the most widely used behavioral signals for predicting search satisfaction. Although clicks correlate with satisfaction, they are also noisy: previous work has shown that they are affected by position bias, caption bias, and other factors. A popular heuristic for reducing this noise is to consider only clicks with long dwell time, typically 30 seconds or more. The rationale is that the more time a searcher spends on a page, the more likely they are to be satisfied with its contents. However, a single threshold assumes that users need a fixed amount of time to be satisfied with any clicked result, irrespective of the page chosen. In reality, clicked pages differ significantly in topic, readability level, content length, and other respects, and all of these factors may affect how long the user spends on the page. In this paper, we study the effect of different page characteristics on the time needed to achieve search satisfaction. We show that the topic of the page, its length, and its readability level are critical in determining the amount of dwell time needed to predict whether a click is associated with satisfaction. We propose a method to model click dwell time and provide a better understanding of it. We estimate click dwell time distributions for SAT (satisfied) and DSAT (dissatisfied) clicks in different click segments and use them to derive features for training a click-level satisfaction model. We compare the proposed model to baseline methods that use dwell time and other search performance predictors as features, and demonstrate that the proposed model achieves significant improvements.
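
The idea of segment-specific dwell time distributions can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the distribution family, the estimator, or the segment definitions, so the gamma family, the method-of-moments fit, the segment names, and the toy dwell times below are all assumptions made for illustration. The sketch fits one distribution per (segment, label) pair and turns a new click's dwell time into a log-likelihood ratio that could serve as one feature for a satisfaction classifier.

```python
import math
from collections import defaultdict
from statistics import mean, pvariance


def fit_gamma_moments(dwell_times):
    """Method-of-moments gamma fit (an assumed estimator): returns (shape, scale)."""
    m = mean(dwell_times)
    v = pvariance(dwell_times)
    return m * m / v, v / m


def gamma_log_pdf(x, shape, scale):
    """Log density of a gamma distribution with the given shape and scale at x > 0."""
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def fit_segment_models(clicks):
    """Fit one gamma model per (segment, label) from (segment, label, dwell) tuples."""
    grouped = defaultdict(list)
    for segment, label, dwell in clicks:
        grouped[(segment, label)].append(dwell)
    return {key: fit_gamma_moments(times) for key, times in grouped.items()}


def sat_log_likelihood_ratio(models, segment, dwell):
    """Feature: log p(dwell | SAT, segment) - log p(dwell | DSAT, segment)."""
    sat = gamma_log_pdf(dwell, *models[(segment, "SAT")])
    dsat = gamma_log_pdf(dwell, *models[(segment, "DSAT")])
    return sat - dsat


# Hypothetical labeled clicks: (segment, label, dwell time in seconds).
clicks = [
    ("long_article", "SAT", 90.0), ("long_article", "SAT", 120.0),
    ("long_article", "SAT", 150.0),
    ("long_article", "DSAT", 10.0), ("long_article", "DSAT", 20.0),
    ("long_article", "DSAT", 30.0),
    ("short_answer", "SAT", 15.0), ("short_answer", "SAT", 25.0),
    ("short_answer", "SAT", 35.0),
    ("short_answer", "DSAT", 2.0), ("short_answer", "DSAT", 4.0),
    ("short_answer", "DSAT", 6.0),
]

models = fit_segment_models(clicks)

# The same 25-second dwell is scored differently depending on the segment.
for segment in ("long_article", "short_answer"):
    ratio = sat_log_likelihood_ratio(models, segment, 25.0)
    print(f"{segment}: log-likelihood ratio at 25 s = {ratio:.2f}")
```

In this toy data a 25-second dwell yields a negative ratio for the long-article segment and a positive one for the short-answer segment, which is the behavior a single fixed threshold cannot capture.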
