On correlation of absence time and search effectiveness

Online search evaluation metrics are typically derived from implicit user feedback, such as the number of page clicks, the number of queries, or the dwell time on a search result. In a recent paper, Dupret and Lalmas introduced a new metric called absence time, which uses the time interval between successive sessions of a user to measure satisfaction with the system; they evaluated this metric on a version of Yahoo! Answers. In this paper, we investigate the effectiveness of absence time for evaluating new features in a web search engine, such as a new ranking algorithm or a new user interface. We measured how absence time responded to 21 experiments performed on a search engine. In 14 of these 21 cases, the conclusions drawn from absence time agreed with the judgement of human experts who performed a thorough analysis of a wide range of online and offline metrics. We also investigated the relationship between absence time and a set of commonly used covariates (features), such as the number of queries and clicks in the session. Our results suggest that users are likely to return to the search engine sooner when their previous session contains more queries and more clicks.
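To make the metric concrete, the sketch below (not from the paper; the toy log and column names are illustrative) computes absence time from a hypothetical session log with pandas: for each user, the absence time of a session is the gap between that session's end and the start of the same user's next session. The survival-analysis references in the bibliography (e.g. [5], [19]) suggest a censored-duration framing, so the last observed session of each user is marked as right-censored.

```python
import pandas as pd

# Toy session log (illustrative, not from the paper): one row per search
# session, with user id, session start/end times, and per-session covariates.
sessions = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "start": pd.to_datetime([
        "2014-01-01 09:00", "2014-01-01 20:00", "2014-01-03 10:00",
        "2014-01-02 08:00", "2014-01-05 08:00",
    ]),
    "end": pd.to_datetime([
        "2014-01-01 09:10", "2014-01-01 20:05", "2014-01-03 10:20",
        "2014-01-02 08:30", "2014-01-05 08:15",
    ]),
    "queries": [3, 1, 5, 2, 4],
    "clicks": [2, 0, 4, 1, 3],
})

# Absence time: gap between the end of a session and the start of the same
# user's next session. The last observed session of each user has no next
# session; in a survival-analysis framing it is right-censored (returned = 0).
sessions = sessions.sort_values(["user_id", "start"])
sessions["next_start"] = sessions.groupby("user_id")["start"].shift(-1)
sessions["absence_hours"] = (
    sessions["next_start"] - sessions["end"]
).dt.total_seconds() / 3600.0
sessions["returned"] = sessions["absence_hours"].notna().astype(int)

print(sessions[["user_id", "queries", "clicks", "absence_hours", "returned"]])
```

Given the Cox-regression and hazard-ratio references ([5], [19]), relating absence time to covariates such as queries and clicks would plausibly use a proportional-hazards model; with a library such as lifelines, one would fit something like `CoxPHFitter().fit(df, duration_col="absence_hours", event_col="returned")` after capping the censored durations at the length of the observation window.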

[1] Filip Radlinski et al. Large-scale validation and analysis of interleaved search evaluation, 2012, TOIS.

[2] Mounia Lalmas et al. Absence time and user engagement: evaluating ranking functions, 2013, WSDM '13.

[3] Hao Jiang et al. Mining User Dwell Time for Personalized Web Search Re-Ranking, 2011, IJCAI.

[4] Filip Radlinski et al. Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search, 2007, TOIS.

[5] D. R. Cox. Regression Models and Life-Tables, 1972, Journal of the Royal Statistical Society, Series B.

[6] Guy Shani et al. A Survey of Accuracy Evaluation Metrics of Recommendation Tasks, 2009, J. Mach. Learn. Res.

[7] Kuansan Wang et al. PSkip: estimating relevance ranking quality from web search clickthrough data, 2009, KDD.

[8] Ben Carterette et al. Evaluating Search Engines by Modeling the Relationship Between Relevance and Clicks, 2007, NIPS.

[9] Ryen W. White et al. Modeling dwell time to predict click-level satisfaction, 2014, WSDM.

[10] Marc J. Leclere et al. Modeling Time to Event: Applications of Survival Analysis in Accounting, Economics and Finance, 2005.

[11] Darja Boršič et al. Duration of Regional Unemployment Spells in Slovenia, 2009.

[12] Matthew Richardson et al. Predicting clicks: estimating the click-through rate for new ads, 2007, WWW '07.

[13] Ryen W. White et al. No clicks, no problem: using cursor movements to understand and improve search, 2011, CHI.

[14] Wei Chu et al. Modeling the impact of short- and long-term behavior on search personalization, 2012, SIGIR '12.

[15] Wei Chu et al. Enhancing personalized search by mining and modeling task behavior, 2013, WWW.

[16] Susan T. Dumais et al. Improving Web Search Ranking by Incorporating User Behavior Information, 2006, SIGIR '06.

[17] Filip Radlinski et al. How does clickthrough data reflect retrieval quality?, 2008, CIKM '08.

[18] Thorsten Joachims et al. Accurately interpreting clickthrough data as implicit feedback, 2005, SIGIR '05.

[19] M. Grace et al. Hazard Ratio in Clinical Trials, 2004, Antimicrobial Agents and Chemotherapy.

[20] Benjamin Piwowarski et al. Web Search Engine Evaluation Using Clickthrough Data and a User Model, 2007.

[21] Nick Craswell et al. Beyond clicks: query reformulation as a predictor of search satisfaction, 2013, CIKM.

[22] Steve Fox et al. Evaluating implicit measures to improve web search, 2005, TOIS.