Corpus-Level End-to-End Exploration for Interactive Systems

A core interest in building Artificial Intelligence (AI) agents is to let them interact with and assist humans. One example is Dynamic Search (DS), which models the process in which a human works with a search engine agent to accomplish a complex, goal-oriented task. Early DS agents based on Reinforcement Learning (RL) have achieved only limited success, for two reasons: (1) they lack direct control over which documents to return, and (2) they have difficulty recovering from wrong search trajectories. In this paper, we present a novel corpus-level end-to-end exploration (CE3) method to address these issues. In our method, an entire text corpus is compressed into a global low-dimensional representation, which gives the agent access to the full state and action spaces, including under-explored areas. We also propose a new form of retrieval function whose linear approximation allows end-to-end manipulation of documents. Experiments on the Text REtrieval Conference (TREC) Dynamic Domain (DD) Track show that CE3 outperforms state-of-the-art DS systems.
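
The two ideas above, a global low-dimensional view of the corpus and a linear retrieval function over it, can be illustrated with a minimal sketch. This is not the CE3 implementation: the abstract does not specify the compression method or the form of the retrieval function, so truncated SVD and a dot-product scorer are used here as stand-ins, and the names `compress_corpus`, `retrieve`, and `weights` are hypothetical.

```python
import numpy as np


def compress_corpus(doc_vectors, dim):
    """Compress every document into `dim` dimensions.

    Assumption: truncated SVD stands in for whatever compression
    the paper actually uses; it is not stated in the abstract.
    """
    centered = doc_vectors - doc_vectors.mean(axis=0)
    # Rows of vt are the principal directions of the corpus.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T


def retrieve(global_repr, weights, k):
    """Score all documents with a linear function and return the top k.

    Assumption: `weights` plays the role of an agent's action; a plain
    dot product stands in for the paper's linear approximation.
    """
    scores = global_repr @ weights
    top = np.argsort(-scores, kind="stable")[:k]
    return top, scores[top]


# Toy corpus: 6 documents with 4 features each (synthetic data).
rng = np.random.default_rng(0)
corpus = rng.normal(size=(6, 4))
global_repr = compress_corpus(corpus, dim=2)

# The agent sees the whole corpus at once and ranks it directly.
doc_ids, doc_scores = retrieve(global_repr, weights=np.array([1.0, 0.5]), k=3)
```

Because the score is linear in `weights`, a change in the agent's output maps directly to a change in which documents are returned, which is the kind of direct control the abstract says earlier RL-based DS agents lacked.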
