Session Search by Direct Policy Learning

This paper proposes a novel retrieval model for session search. Through gradient descent, the model learns optimal policies for choosing the best search engine actions from the observed interactions between the user and the search engine. The proposed framework applies direct policy learning to session search, greatly reducing model complexity compared with prior work. Its flexible design accommodates a wide range of features describing the rich interactions in session search. Evaluation on recent TREC Session Tracks shows that the framework is highly effective. As part of the effort to bring reinforcement learning to information retrieval, this paper makes a novel theoretical modeling contribution to session search.
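To make the idea of direct policy learning concrete, the sketch below shows a minimal REINFORCE-style policy-gradient loop: interaction observations are encoded as a feature vector, a log-linear policy scores candidate search engine actions, and the policy parameters are updated by gradient ascent on the observed reward. This is an illustrative assumption of how such a learner could look, not the paper's actual model; the feature dimensions, action set, and reward function here are all hypothetical stand-ins.

```python
# Minimal REINFORCE-style policy-gradient sketch of direct policy learning.
# All names and the toy feature/action/reward setup are illustrative assumptions,
# not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8   # features describing user/search-engine interactions (assumed)
N_ACTIONS = 4    # candidate search engine actions, e.g. re-rank, expand query (assumed)

theta = np.zeros((N_FEATURES, N_ACTIONS))  # policy parameters


def softmax_policy(state, theta):
    """Probability of each action under a log-linear (softmax) policy."""
    logits = state @ theta
    logits -= logits.max()                 # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


def simulate_reward(state, action):
    """Stand-in for session-level feedback (e.g. an nDCG gain); purely synthetic."""
    return float(state[action % N_FEATURES] > 0.5) + 0.1 * rng.standard_normal()


learning_rate = 0.05
for episode in range(2000):
    state = rng.random(N_FEATURES)         # one interaction's feature vector
    probs = softmax_policy(state, theta)
    action = rng.choice(N_ACTIONS, p=probs)
    reward = simulate_reward(state, action)

    # REINFORCE gradient: grad log pi(a|s) = s * (one_hot(a) - probs)
    grad_log_pi = np.outer(state, -probs)
    grad_log_pi[:, action] += state

    # Gradient ascent on expected reward
    theta += learning_rate * reward * grad_log_pi
```

The key property this illustrates is that the policy is optimized directly from interaction data and rewards, without first estimating a full state-transition model, which is the usual argument for the lower model complexity of direct policy search.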
