Task Embeddings: Learning Query Embeddings using Task Context

Continuous-space word embeddings have been shown to be highly effective in many information retrieval tasks. Embedding models use the local information available in immediately surrounding words to place words that share contexts close together in the embedding space. Web search sessions are increasingly multi-tasking in nature: users often try to accomplish several different tasks in a single session. Consequently, the search context is polluted with queries from unrelated tasks, which renders the context heterogeneous. In this work, we hypothesize that task information provides better context for IR systems to learn from. We propose a novel task context embedding architecture that learns representations of queries in a low-dimensional space by leveraging their task context information from historical search logs using neural embedding models. In addition to a qualitative analysis, we empirically demonstrate the benefit of leveraging task context to learn query representations.
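To make the contrast between session context and task context concrete, the following minimal Python sketch builds skip-gram-style (target, context) query pairs from a toy log, grouping queries either by session or by task. It is an illustration under stated assumptions, not the architecture proposed in the paper: the log format, the task labels (assumed to be already extracted), and the names `context_pairs`, `group_queries`, and `training_pairs` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical log format: (session_id, task_id, query), in chronological order.
# Task labels are assumed to be available from some task-extraction step.
LogEntry = Tuple[str, str, str]


def context_pairs(sequence: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (target, context) pairs for queries within `window` positions."""
    pairs = []
    for i, target in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, sequence[j]))
    return pairs


def group_queries(log: List[LogEntry], key_index: int) -> Dict[str, List[str]]:
    """Group queries by session (key_index=0) or by task (key_index=1)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for entry in log:
        groups[entry[key_index]].append(entry[2])
    return dict(groups)


def training_pairs(log: List[LogEntry], by_task: bool, window: int = 2):
    """Build training pairs using task context or plain session context."""
    key_index = 1 if by_task else 0
    pairs = []
    for sequence in group_queries(log, key_index).values():
        pairs.extend(context_pairs(sequence, window))
    return pairs


# Toy session that interleaves two unrelated tasks.
log = [
    ("s1", "travel", "flights to rome"),
    ("s1", "cooking", "carbonara recipe"),
    ("s1", "travel", "rome hotels"),
    ("s1", "cooking", "guanciale substitute"),
]

session_pairs = training_pairs(log, by_task=False, window=1)
task_pairs = training_pairs(log, by_task=True, window=1)
```

With session context, "flights to rome" is paired with the unrelated query "carbonara recipe"; with task context, it is paired only with "rome hotels". Pairs of this kind could then be fed to any skip-gram-style embedding model.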
