Continual Learning for Generative Retrieval over Dynamic Corpora

Generative retrieval (GR) directly predicts the identifiers of relevant documents (i.e., docids) with a parametric model and has achieved strong performance on many ad-hoc retrieval tasks. So far, these tasks have assumed a static document collection. In many practical scenarios, however, document collections are dynamic: new documents are continuously added to the corpus. The ability to incrementally index new documents while still answering queries with both previously and newly indexed relevant documents is vital for applying GR models. In this paper, we address this practical continual learning problem for GR. We put forward a novel Continual-LEarner for generatiVE Retrieval (CLEVER) model and make two major contributions to continual learning for GR: (i) to encode new documents into docids at low computational cost, we present Incremental Product Quantization, which updates a partial quantization codebook according to two adaptive thresholds; and (ii) to memorize new documents for querying without forgetting previous knowledge, we propose a memory-augmented learning mechanism that forms meaningful connections between old and new documents. Empirical results demonstrate the effectiveness and efficiency of the proposed model.
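
To make the two contributions concrete, here is a minimal, hedged sketch of the first one: product quantization assigns each document embedding a docid by splitting it into sub-vectors and mapping each sub-vector to its nearest centroid in a per-sub-space codebook; an incremental variant then updates only the sub-codebooks that fit new documents poorly. The function names, the two thresholds (tau_update, tau_add), and the update rules below are illustrative placeholders, not the paper's exact formulation.

```python
# Illustrative sketch of incremental product quantization (IPQ).
# Assumption: documents are dense embeddings; a docid is the sequence of
# centroid indices, one per sub-space. Thresholds and update rules are
# placeholders standing in for the paper's two adaptive thresholds.
import numpy as np
from sklearn.cluster import KMeans

def build_codebooks(doc_embs: np.ndarray, n_sub: int, n_centroids: int):
    """Standard PQ: split embeddings into n_sub sub-vectors, k-means per sub-space."""
    sub_dim = doc_embs.shape[1] // n_sub
    codebooks = []
    for m in range(n_sub):
        sub = doc_embs[:, m * sub_dim:(m + 1) * sub_dim]
        km = KMeans(n_clusters=n_centroids, n_init=10).fit(sub)
        codebooks.append(km.cluster_centers_)
    return codebooks

def encode(doc_embs: np.ndarray, codebooks):
    """Docid = sequence of nearest-centroid indices, one per sub-space."""
    sub_dim = doc_embs.shape[1] // len(codebooks)
    ids = []
    for m, cb in enumerate(codebooks):
        sub = doc_embs[:, m * sub_dim:(m + 1) * sub_dim]
        dists = np.linalg.norm(sub[:, None, :] - cb[None, :, :], axis=-1)
        ids.append(dists.argmin(axis=1))
    return np.stack(ids, axis=1)  # shape: (n_docs, n_sub)

def incremental_update(new_embs: np.ndarray, codebooks,
                       tau_update: float = 0.5, tau_add: float = 1.0):
    """Update only sub-codebooks that fit the new documents poorly; the rest
    stay fixed, which is where the computational saving comes from."""
    sub_dim = new_embs.shape[1] // len(codebooks)
    for m in range(len(codebooks)):
        cb = codebooks[m]
        sub = new_embs[:, m * sub_dim:(m + 1) * sub_dim]
        dists = np.linalg.norm(sub[:, None, :] - cb[None, :, :], axis=-1)
        assign = dists.argmin(axis=1)
        nearest = dists.min(axis=1)
        if nearest.mean() > tau_add:
            # New docs are far from every centroid: grow the codebook.
            # Existing docids stay valid because old indices are unchanged.
            codebooks[m] = np.vstack([cb, sub.mean(axis=0, keepdims=True)])
        elif nearest.mean() > tau_update:
            # Moderately far: nudge the assigned centroids toward the new data.
            for j in np.unique(assign):
                codebooks[m][j] = 0.9 * cb[j] + 0.1 * sub[assign == j].mean(axis=0)
        # else: this sub-codebook already fits the new documents; leave it fixed.
    return codebooks
```

For the second contribution, the generic idea behind memory-augmented learning is to replay a small buffer of old (query, docid) pairs while training on new ones, so that indexing new documents does not overwrite the mapping for old ones. The sketch below shows plain episodic replay as an approximation; the paper's mechanism additionally forms connections between old and new documents, which this placeholder does not capture.

```python
# Hedged sketch of episodic replay during incremental indexing. The buffer,
# batch layout, and ratio are assumptions, not the paper's exact mechanism.
import random

def replay_batches(old_memory, new_pairs, batch_size=32, replay_ratio=0.25):
    """Yield batches that mix new (query, docid) pairs with samples replayed
    from a buffer of old pairs, to mitigate catastrophic forgetting."""
    n_replay = int(batch_size * replay_ratio)
    n_new = batch_size - n_replay
    for i in range(0, len(new_pairs), n_new):
        batch = list(new_pairs[i:i + n_new])
        if old_memory:
            batch += random.sample(old_memory, min(n_replay, len(old_memory)))
        random.shuffle(batch)
        yield batch
```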
