Standing on the Shoulders of Giant Frozen Language Models

condense relevant information from 100+ retrieved documents into the input sequence length of the frozen LM reader. We show that this approach can reach and surpass leading fine-tuning approaches on Natural Questions, an open-domain question answering benchmark.
