Learning to Truncate Ranked Lists for Information Retrieval

Ranked list truncation is of critical importance in a variety of professional information retrieval applications, such as patent search and legal search. The goal is to dynamically determine the number of returned documents according to some user-defined objective, in order to strike a balance between the overall utility of the results and user effort. Existing methods formulate this task as a sequential decision problem and optimize some pre-defined loss as a proxy objective, which suffers from the limitations of local decisions and indirect optimization. In this work, we propose a global-decision-based truncation model named AttnCut, which directly optimizes user-defined objectives for ranked list truncation. Specifically, we adopt the Transformer architecture to capture the global dependencies within the ranked list for the truncation decision, and employ reward augmented maximum likelihood (RAML) for direct optimization. We consider two types of user-defined objectives of practical interest. One is a widely adopted metric such as F1, which acts as a balanced objective; the other is the best F1 under a minimal recall constraint, which represents a typical objective in professional search. Empirical results on the Robust04 and MQ2007 datasets demonstrate the effectiveness of our approach compared with state-of-the-art baselines.
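The two user-defined objectives can be made concrete with a small sketch. Given the binary relevance labels of a ranked list (top to bottom), the oracle truncation scans every cutoff k and scores it, either by plain F1 or by F1 subject to a minimal recall constraint. The function names below are illustrative, not from the paper, and the sketch assumes the labels cover all relevant documents for the query.

```python
def f1_at_cutoff(labels, k, total_relevant):
    """F1 of returning the top-k documents of the ranked list."""
    retrieved_relevant = sum(labels[:k])
    if retrieved_relevant == 0:
        return 0.0
    precision = retrieved_relevant / k
    recall = retrieved_relevant / total_relevant
    return 2 * precision * recall / (precision + recall)

def best_cutoff(labels, min_recall=None):
    """Best cutoff k by F1, optionally subject to recall >= min_recall.

    `labels` is a list of 0/1 relevance judgments in rank order; the sketch
    assumes every relevant document for the query appears in the list.
    """
    total_relevant = sum(labels)
    best_k, best_f1 = 0, 0.0
    for k in range(1, len(labels) + 1):
        recall = sum(labels[:k]) / total_relevant if total_relevant else 0.0
        # Skip cutoffs that violate the minimal recall constraint.
        if min_recall is not None and recall < min_recall:
            continue
        f1 = f1_at_cutoff(labels, k, total_relevant)
        if f1 > best_f1:
            best_k, best_f1 = k, f1
    return best_k, best_f1
```

For labels [1, 0, 0, 0, 1], the unconstrained best F1 is reached by cutting after the first document, whereas requiring full recall forces the cutoff to the fifth document at a lower F1, illustrating the trade-off a professional-search objective encodes.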
