Deconstructing Complex Search Tasks: A Bayesian Nonparametric Approach for Extracting Sub-tasks

Search tasks, each comprising a series of search queries that serve a common informational need, have steadily emerged as accurate units for developing the next generation of task-aware web search systems. Most prior research in this area has focused on segmenting chronologically ordered search queries into higher-level tasks. A more naturalistic viewpoint treats query logs as convoluted structures of tasks and sub-tasks, with complex search tasks decomposed into more focused sub-tasks. In this work, we focus on extracting sub-tasks from a given collection of on-task search queries, jointly leveraging insights from Bayesian nonparametrics and word embeddings to identify them. Our proposed model can inform the design of the next generation of task-based search systems that leverage users' task behavior for better support and personalization.
