Conformer-Kernel with Query Term Independence at TREC 2020 Deep Learning Track

We benchmark Conformer-Kernel models under the strict blind evaluation setting of the TREC 2020 Deep Learning track. In particular, we study the impact of incorporating: (i) explicit term matching to complement matching based on learned representations (i.e., the "Duet principle"), (ii) query term independence (i.e., the "QTI assumption") to scale the model to the full retrieval setting, and (iii) the ORCAS click data as an additional document description field. We find evidence that all three strategies can improve retrieval quality.
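
To illustrate why the QTI assumption helps with full retrieval, here is a minimal sketch, not the Conformer-Kernel implementation. It assumes that a query-document score decomposes into a sum of per-term scores, so those scores can be precomputed offline and looked up at query time in an inverted-index-like structure. The function names (`build_index`, `qti_retrieve`) and the toy scores are hypothetical; in practice the per-term scores would come from the trained ranking model.

```python
from collections import defaultdict


def build_index(term_doc_scores):
    """Invert (term, doc_id, score) triples into term -> {doc_id: score}."""
    index = defaultdict(dict)
    for term, doc_id, score in term_doc_scores:
        index[term][doc_id] = score
    return index


def qti_retrieve(query, index, k=10):
    """Score each document as the sum of independent per-term scores."""
    totals = defaultdict(float)
    for term in query.lower().split():
        # Each query term contributes independently of the other terms.
        for doc_id, score in index.get(term, {}).items():
            totals[doc_id] += score
    # Highest score first; ties broken by document id for determinism.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical precomputed term-document scores (illustration only).
toy_scores = [
    ("neural", "d1", 1.5),
    ("neural", "d2", 0.5),
    ("ranking", "d2", 2.0),
    ("ranking", "d3", 1.0),
]
index = build_index(toy_scores)
results = qti_retrieve("neural ranking", index, k=2)
print(results)
```

Because no term's contribution depends on the rest of the query, the expensive model only has to run once per term-document pair offline, and query-time work reduces to lookups and additions.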
