Out-of-Domain Semantics to the Rescue! Zero-Shot Hybrid Retrieval Models

Deep retrieval models based on pre-trained language models (e.g., BERT) have achieved superior performance over lexical retrieval models (e.g., BM25) on many passage retrieval tasks. However, limited work has been done on generalizing a deep retrieval model to other tasks and domains. In this work, we carefully select five datasets, including two in-domain datasets and three out-of-domain datasets with different levels of domain shift, and study the generalization of a deep model in a zero-shot setting. Our findings show that the performance of a deep retrieval model deteriorates significantly when the target domain is very different from the source domain the model was trained on. In contrast, lexical models are more robust across domains. We thus propose a simple yet effective framework to integrate lexical and deep retrieval models. Our experiments demonstrate that the two models are complementary, even when the deep model is weaker in the out-of-domain setting. The hybrid model obtains an average relative gain of 20.4% over the deep retrieval model and 9.54% over the lexical model on the three out-of-domain datasets.
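The abstract does not spell out how the lexical and deep scores are fused, so the following is only a minimal, self-contained sketch of one common hybridization scheme: min-max normalize each system's scores and take a convex combination with a weight lam. The function names, the default weight, and the toy scores are illustrative assumptions, not the paper's actual method.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one retrieval run's scores into [0, 1] so lexical and dense scores are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:  # degenerate case: all candidates tied
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(
    lexical_scores: Dict[str, float],  # e.g., BM25 scores keyed by doc id
    dense_scores: Dict[str, float],    # e.g., dot products from a BERT dual encoder
    lam: float = 0.5,                  # interpolation weight (illustrative default, would be tuned)
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Fuse two retrieval runs by a convex combination of their normalized scores.

    A document retrieved by only one system contributes 0 from the other system.
    """
    lex = min_max_normalize(lexical_scores)
    den = min_max_normalize(dense_scores)
    candidates = set(lex) | set(den)
    fused = {
        doc: lam * lex.get(doc, 0.0) + (1.0 - lam) * den.get(doc, 0.0)
        for doc in candidates
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy example: scores from a lexical run and a dense run over the same query.
    bm25 = {"d1": 12.3, "d2": 9.8, "d3": 4.1}
    dense = {"d2": 0.82, "d4": 0.77, "d1": 0.40}
    print(hybrid_rank(bm25, dense, lam=0.6, top_k=3))
```

Score interpolation is only one option; rank-based fusion (e.g., reciprocal rank fusion) is another common choice when the two systems' score scales are hard to calibrate.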
