Incorporating Clicks, Attention and Satisfaction into a Search Engine Result Page Evaluation Model

Modern search engine result pages often provide immediate value to users and organize information so that it is easy to navigate. The core ranking function contributes to this, as do result snippets, smart organization of result blocks, and extensive use of one-box answers or side panels. While such features are useful to the user and help search engines stand out, they present two big challenges for evaluation. First, the presence of such elements on a search engine result page (SERP) may lead to an absence of clicks that does not signal dissatisfaction, the so-called 'good abandonment.' Second, the non-linear layout and visual differences between SERP items may lead to non-trivial patterns of user attention, which are not captured by existing evaluation metrics. In this paper we propose the CAS model, a model of user behavior on a SERP that jointly captures click behavior, user attention, and satisfaction, and demonstrate that it gives more accurate predictions of user actions and self-reported satisfaction than existing models based on clicks alone. We use the CAS model to build a novel evaluation metric that can be applied to non-linear SERP layouts and that accounts for the utility that users obtain directly on a SERP. We demonstrate that this metric shows better agreement with user-reported satisfaction than conventional evaluation metrics.
