The Query Change Model

Modern information retrieval (IR) systems exhibit user dynamics through interactivity. These dynamic aspects of IR, including changes found in data, users, and systems, are increasingly being utilized in search engines. Session search, that is, document retrieval within a session, is one such IR task. During a session, a user constantly modifies queries to find documents that fulfill an information need. Existing IR techniques for assisting the user in this task are limited in their ability to optimize over changes, learn with a minimal computational footprint, and be responsive. This article proposes a novel query change retrieval model (QCM), which uses syntactic editing changes between consecutive queries, as well as the relationship between query changes and previously retrieved documents, to enhance session search. We propose modeling session search as a Markov decision process (MDP). We consider two agents in this MDP: the user agent and the search engine agent. The user agent’s actions are query changes that we observe, and the search engine agent’s actions are term weight adjustments as proposed in this work. We also investigate multiple query aggregation schemes and their effectiveness on session search. Experiments show that our approach is highly effective and outperforms top session search systems in TREC 2011 and TREC 2012.
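To make the notion of a syntactic query change concrete, the following minimal sketch splits two consecutive queries into added, removed, and retained terms. It is an illustration only, not the article's implementation: the function names `query_change` and `session_changes`, the whitespace tokenization, the lowercasing, and the sample session are all assumptions introduced here, and the term weight adjustments themselves are not modeled.

```python
from typing import Dict, List


def query_change(prev_query: str, curr_query: str) -> Dict[str, List[str]]:
    """Compare two consecutive queries term by term (hypothetical helper).

    Assumes simple whitespace tokenization and case folding; a real system
    would likely use the retrieval engine's own tokenizer.
    """
    prev_terms = prev_query.lower().split()
    curr_terms = curr_query.lower().split()
    prev_set, curr_set = set(prev_terms), set(curr_terms)
    return {
        # Terms the user introduced in the current query.
        "added": [t for t in curr_terms if t not in prev_set],
        # Terms the user dropped from the previous query.
        "removed": [t for t in prev_terms if t not in curr_set],
        # Terms shared by both queries.
        "retained": [t for t in curr_terms if t in prev_set],
    }


def session_changes(queries: List[str]) -> List[Dict[str, List[str]]]:
    """Return the change between each pair of consecutive queries in a session."""
    return [query_change(a, b) for a, b in zip(queries, queries[1:])]


# Made-up session used purely for illustration.
session = [
    "pocono mountains pennsylvania",
    "pocono mountains pennsylvania hotels",
    "pocono mountains camelbeach",
]
changes = session_changes(session)
for step, change in enumerate(changes, start=1):
    print(step, change)
```

In this sketch, the observed changes play the role of the user agent's actions; a search engine agent would then respond by adjusting the weights of the added, removed, and retained terms, which is the part the article itself proposes.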
