Investigating Result Usefulness in Mobile Search

Existing evaluation approaches for search engines usually estimate the utility or usefulness of search results from either explicit relevance annotations by external assessors or implicit behavior signals from users. Because mobile search differs from desktop search in both search tasks and the presentation styles of SERPs, whether approaches that originated in desktop settings remain valid in the mobile scenario needs further investigation. To address this problem, we conduct a laboratory user study to record users’ search behaviors and collect their usefulness feedback for search results when using mobile devices. By analyzing the collected data, we investigate and characterize how the relevance of a result, as well as its ranking position and presentation style, affects its user-perceived usefulness. We identify a moderating effect of presentation style on the correlation between relevance and usefulness, as well as a position bias affecting usefulness in the initial viewport. By correlating result-level usefulness feedback and relevance annotations with query-level satisfaction, we confirm the finding that usefulness feedback reflects user satisfaction better than relevance annotations in mobile search. We also study the relationship between users’ usefulness feedback and their implicit search behavior, showing that viewport features can be used to estimate usefulness when click signals are absent. Our study highlights the differences between desktop and mobile search and sheds light on developing a more user-centric evaluation method for mobile search.
