Multi-Stage Document Ranking with BERT

The advent of deep neural networks pre-trained via language modeling tasks has spurred a number of successful applications in natural language processing. This work explores one such popular model, BERT, in the context of document ranking. We propose two variants, called monoBERT and duoBERT, that formulate the ranking problem as pointwise and pairwise classification, respectively. These two models are arranged in a multi-stage ranking architecture to form an end-to-end search system. One major advantage of this design is the ability to trade off quality against latency by controlling the admission of candidates into each pipeline stage, and by doing so, we are able to find operating points that offer a good balance between these two competing metrics. On two large-scale datasets, MS MARCO and TREC CAR, experiments show that our model produces results that are either at or comparable to the state of the art. Ablation studies show the contributions of each component and characterize the latency/quality tradeoff space.

[1]  W. Bruce Croft,et al.  A Deep Relevance Matching Model for Ad-hoc Retrieval , 2016, CIKM.

[2]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.

[3]  Jimmy J. Lin,et al.  Critically Examining the "Neural Hype": Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models , 2019, SIGIR.

[4]  J. Shane Culpepper,et al.  Efficient Cost-Aware Cascade Ranking in Multi-Stage Retrieval , 2017, SIGIR.

[5]  J. Shane Culpepper,et al.  Query Driven Algorithm Selection in Early Stage Retrieval , 2018, WSDM.

[6]  Gerard de Melo,et al.  Co-PACRR: A Context-Aware Neural IR Model for Ad-hoc Retrieval , 2017, WSDM.

[7]  Colin Raffel,et al.  Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res..

[8]  Jimmy J. Lin,et al.  End-to-End Open-Domain Question Answering with BERTserini , 2019, NAACL.

[9]  Zhiyuan Liu,et al.  End-to-End Neural Ad-hoc Ranking with Kernel Pooling , 2017, SIGIR.

[10]  Nazli Goharian,et al.  CEDR: Contextualized Embeddings for Document Ranking , 2019, SIGIR.

[11]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[12]  Berkant Barla Cambazoglu,et al.  Early exit optimizations for additive machine learned ranking systems , 2010, WSDM '10.

[13]  Stephen E. Robertson,et al.  Okapi at TREC-3 , 1994, TREC.

[14]  Stephen E. Robertson,et al.  GatfordCentre for Interactive Systems ResearchDepartment of Information , 1996 .

[15]  Jimmy J. Lin,et al.  Effectiveness/efficiency tradeoffs for candidate generation in multi-stage retrieval architectures , 2013, SIGIR.

[16]  Christopher J. C. Burges,et al.  High accuracy retrieval with multiple nested ranker , 2006, SIGIR.

[17]  Matthew Henderson,et al.  Efficient Natural Language Response Suggestion for Smart Reply , 2017, ArXiv.

[18]  Jimmy J. Lin,et al.  Old dogs are great at new tricks: column stores for ir prototyping , 2014, SIGIR.

[19]  Bhaskar Mitra,et al.  An Introduction to Neural Information Retrieval , 2018, Found. Trends Inf. Retr..

[20]  Kyle Lo,et al.  SciBERT: Pretrained Contextualized Embeddings for Scientific Text , 2019, ArXiv.

[21]  Hang Li Learning to Rank for Information Retrieval and Natural Language Processing , 2011, Synthesis Lectures on Human Language Technologies.

[22]  Iz Beltagy,et al.  SciBERT: A Pretrained Language Model for Scientific Text , 2019, EMNLP.

[23]  Jimmy J. Lin,et al.  The Neural Hype and Comparisons Against Weak Baselines , 2019, SIGIR Forum.

[24]  Ellen M. Voorhees,et al.  The Philosophy of Information Retrieval Evaluation , 2001, CLEF.

[25]  Md. Mustafizur Rahman,et al.  Neural information retrieval: at the end of the early years , 2017, Information Retrieval Journal.

[26]  Alec Radford,et al.  Improving Language Understanding by Generative Pre-Training , 2018 .

[27]  Jianfeng Gao,et al.  A Human Generated MAchine Reading COmprehension Dataset , 2018 .

[28]  Jimmy J. Lin,et al.  Anserini: Enabling the Use of Lucene for Information Retrieval Research , 2017, SIGIR.

[29]  Jimmy J. Lin,et al.  Anserini , 2018, Journal of Data and Information Quality.

[30]  Tie-Yan Liu,et al.  Learning to rank for information retrieval , 2009, SIGIR.

[31]  Kyunghyun Cho,et al.  Passage Re-ranking with BERT , 2019, ArXiv.

[32]  Javed A. Aslam,et al.  Condorcet fusion for improved retrieval , 2002, CIKM '02.

[33]  Nick Craswell,et al.  Learning to Match using Local and Distributed Representations of Text for Web Search , 2016, WWW.

[34]  Luo Si,et al.  Cascade Ranking for Operational E-commerce Search , 2017, KDD.

[35]  W. Bruce Croft,et al.  From Neural Re-Ranking to Neural Ranking: Learning a Sparse Representation for Inverted Indexing , 2018, CIKM.

[36]  Jimmy J. Lin,et al.  Cross-Domain Modeling of Sentence-Level Evidence for Document Retrieval , 2019, EMNLP.

[37]  Filip Radlinski,et al.  TREC Complex Answer Retrieval Overview , 2018, TREC.

[38]  Andrew Yates,et al.  Contextualized PACRR for Complex Answer Retrieval , 2017, TREC.

[39]  Laura Dietz,et al.  TREMA-UNH at TREC 2018: Complex Answer Retrieval and News Track , 2018, TREC.

[40]  Chandra Bhagavatula,et al.  Semi-supervised sequence tagging with bidirectional language models , 2017, ACL.

[41]  Kilian Q. Weinberger,et al.  The Greedy Miser: Learning under Test-time Budgets , 2012, ICML.

[42]  Charles L. A. Clarke,et al.  Reciprocal rank fusion outperforms condorcet and individual rank learning methods , 2009, SIGIR.

[43]  Jimmy J. Lin,et al.  Simple Applications of BERT for Ad Hoc Document Retrieval , 2019, ArXiv.

[44]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[45]  Tao Yang,et al.  Efficient Interaction-based Neural Ranking with Locality Sensitive Hashing , 2019, WWW.

[46]  Craig MacDonald,et al.  Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge , 2016, ECIR.