Choppy: Cut Transformer for Ranked List Truncation

Work in information retrieval has traditionally focused on ranking and relevance: given a query, return some number of results ordered by relevance to the user. However, the problem of determining how many results to return, i.e., how to optimally truncate the ranked result list, has received less attention despite being of critical importance in a range of applications. Such truncation is a balancing act between the overall relevance, or usefulness, of the results and the user cost of processing more results. In this work, we propose Choppy, an assumption-free model based on the widely successful Transformer architecture, for the ranked list truncation problem. Needing nothing more than the relevance scores of the results, the model uses a powerful multi-head attention mechanism to directly optimize any user-defined IR metric. We show that Choppy improves upon recent state-of-the-art methods.
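
To make the truncation trade-off concrete, the following minimal sketch computes, for a single ranked list with known binary relevance labels, the cutoff that maximizes F1. It is not the Choppy model: it is an oracle that only illustrates the kind of metric a truncation model would try to optimize. The function names `f1_at_cutoffs` and `best_cutoff`, and the choice of F1 as the user-defined metric, are assumptions made for this illustration.

```python
from typing import List, Tuple


def f1_at_cutoffs(labels: List[int]) -> List[float]:
    """Return F1 of the truncated list for every cutoff k = 1..len(labels).

    `labels` holds binary relevance judgments (1 = relevant) in ranked order.
    """
    total_relevant = sum(labels)
    scores = []
    hits = 0
    for k, label in enumerate(labels, start=1):
        hits += label
        if hits == 0 or total_relevant == 0:
            # No relevant result returned yet: F1 is zero.
            scores.append(0.0)
            continue
        precision = hits / k
        recall = hits / total_relevant
        scores.append(2 * precision * recall / (precision + recall))
    return scores


def best_cutoff(labels: List[int]) -> Tuple[int, float]:
    """Return (k, F1) for the oracle cutoff; k = 0 means 'return nothing'."""
    best_k, best_score = 0, 0.0
    for k, score in enumerate(f1_at_cutoffs(labels), start=1):
        if score > best_score:  # strict: ties keep the shorter list
            best_k, best_score = k, score
    return best_k, best_score


if __name__ == "__main__":
    # Hypothetical judgments for one query, in ranked order.
    k, score = best_cutoff([1, 1, 0, 1, 0, 0])
    print(k, round(score, 3))
```

In this hypothetical list, cutting after the fourth result captures all three relevant documents at the cost of one irrelevant one; returning more results only lowers precision. A learned truncation model has to make this choice from relevance scores alone, without access to the labels.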
