Search, interrupted: understanding and predicting search task continuation

Many important search tasks require multiple search sessions to complete. Tasks such as travel planning, large purchases, or job searches can span hours, days, or even weeks. Inevitably, life interferes, requiring the searcher either to recover the "state" of the search manually (most common) or to plan for interruption in advance (unlikely). The goal of this work is to better understand, characterize, and automatically detect search tasks that will be continued in the near future. To this end, we analyze a query log from the Bing Web search engine to identify the types of intents, topics, and search behavior patterns associated with long-running tasks that are likely to be continued. Using these insights, we develop an effective prediction algorithm that significantly outperforms both the previous state-of-the-art method and human judges at predicting future task continuation. Our techniques could allow a search engine to pre-emptively "save state" for a searcher (e.g., by caching search results), perform more targeted personalization, and otherwise better support the searcher experience for interrupted search tasks.
