Standing on the Shoulders of Giant Frozen Language Models

condense relevant information from 100+ retrieved documents into the input sequence length of the frozen LM reader. We show that this approach can reach and surpass leading fine-tuning approaches on Natural Questions, an open-domain question answering benchmark.
