Learning Dense Representations of Phrases at Scale

Open-domain question answering can be reformulated as a phrase retrieval problem, without the need for processing documents on demand during inference (Seo et al., 2019). However, current phrase retrieval models heavily depend on sparse representations and still underperform retriever-reader approaches. In this work, we show for the first time that we can learn dense phrase representations alone that achieve much stronger performance in open-domain QA. Our approach includes (1) learning query-agnostic phrase representations via question generation and distillation; (2) novel negative-sampling methods for global normalization; and (3) query-side fine-tuning for transfer learning. On five popular QA datasets, our model DensePhrases improves over previous phrase retrieval models by 15%–25% absolute accuracy and matches the performance of state-of-the-art retriever-reader models. Our model is easy to parallelize thanks to purely dense representations and processes more than 10 questions per second on CPUs. Finally, we directly use our pre-indexed dense phrase representations for two slot filling tasks, showing the promise of utilizing DensePhrases as a dense knowledge base for downstream tasks.
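
At inference time the pipeline is simple: every phrase in the corpus is pre-encoded into a dense vector offline, and answering a question reduces to maximum inner product search (MIPS) between the question vector and the phrase index. Below is a minimal sketch of that retrieval step using FAISS (Johnson et al., 2017); the random "encoders", vector dimensionality, and toy phrase list are placeholders for illustration only, not the actual DensePhrases implementation.

import numpy as np
import faiss  # billion-scale similarity search library (Johnson et al., 2017)

# Placeholder dimensionality and phrase list; in practice a trained
# query-agnostic phrase encoder would produce these vectors offline.
d = 128
phrases = ["Barack Obama", "August 4, 1961", "Honolulu, Hawaii"]
rng = np.random.default_rng(0)
phrase_vecs = rng.standard_normal((len(phrases), d)).astype("float32")

# Offline: build a MIPS index over all phrase vectors (inner-product index).
index = faiss.IndexFlatIP(d)
index.add(phrase_vecs)

# Online: encode only the question and retrieve the top-k phrases directly,
# with no on-demand document reading.
question_vec = rng.standard_normal((1, d)).astype("float32")
scores, ids = index.search(question_vec, 2)
for score, i in zip(scores[0], ids[0]):
    print(phrases[i], float(score))

Because the phrase index is built once offline, per-question cost is dominated by a single question encoding plus MIPS, which is what makes throughputs above 10 questions per second on CPUs attainable.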

[1] Ming-Wei Chang et al. Natural Questions: A Benchmark for Question Answering Research, 2019, TACL.

[2] Nicola De Cao et al. KILT: a Benchmark for Knowledge Intensive Language Tasks, 2020, arXiv.

[3] Ming-Wei Chang et al. REALM: Retrieval-Augmented Language Model Pre-Training, 2020, ICML.

[4] Niranjan Balasubramanian et al. DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering, 2020, ACL.

[5] Geoffrey E. Hinton et al. Distilling the Knowledge in a Neural Network, 2015, arXiv.

[6] Jimmy J. Lin et al. End-to-End Open-Domain Question Answering with BERTserini, 2019, NAACL.

[7] Danqi Chen et al. A Discrete Hard EM Approach for Weakly Supervised Question Answering, 2019, EMNLP.

[8] Charlotte Pasqual et al. Delaying Interaction Layers in Transformer-based Encoders for Efficient Open Domain Question Answering, 2020, arXiv.

[9] Petr Baudis et al. Modeling of the Question Answering Task in the YodaQA System, 2015, CLEF.

[10] Jimmy Ba et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[11] Ming-Wei Chang et al. Latent Retrieval for Weakly Supervised Open Domain Question Answering, 2019, ACL.

[12] Danqi Chen et al. Dense Passage Retrieval for Open-Domain Question Answering, 2020, EMNLP.

[13] Christophe Gravier et al. T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples, 2018, LREC.

[14] Jennifer Chu-Carroll et al. Building Watson: An Overview of the DeepQA Project, 2010, AI Magazine.

[15] Kaiming He et al. Momentum Contrast for Unsupervised Visual Representation Learning, 2020, CVPR.

[16] Jian Zhang et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.

[17] Jeff Johnson et al. Billion-Scale Similarity Search with GPUs, 2017, IEEE Transactions on Big Data.

[18] Colin Raffel et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, JMLR.

[19] Omer Levy et al. Zero-Shot Relation Extraction via Reading Comprehension, 2017, CoNLL.

[20] Colin Raffel et al. How Much Knowledge Can You Pack Into the Parameters of a Language Model?, 2020, EMNLP.

[21] Omer Levy et al. SpanBERT: Improving Pre-training by Representing and Predicting Spans, 2019, TACL.

[22] Jaewoo Kang et al. Contextualized Sparse Representations for Real-Time Open-Domain Question Answering, 2020, ACL.

[23] Ali Farhadi et al. Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension, 2018, EMNLP.

[24] Ellen M. Voorhees et al. The TREC-8 Question Answering Track Report, 1999, TREC.

[25] Ali Farhadi et al. Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index, 2019, ACL.

[26] Richard Socher et al. Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering, 2019, ICLR.

[27] Matthew Henderson et al. Efficient Natural Language Response Suggestion for Smart Reply, 2017, arXiv.

[28] Eunsol Choi et al. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension, 2017, ACL.

[29] Matei Zaharia et al. ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT, 2020, SIGIR.

[30] Lucian Vlad Lita et al. tRuEcasIng, 2003, ACL.

[31] Ming-Wei Chang et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[32] Kenton Lee et al. Learning Recurrent Span Representations for Extractive Question Answering, 2016, arXiv.

[33] Jason Weston et al. Reading Wikipedia to Answer Open-Domain Questions, 2017, ACL.

[34] Mark Andrew Greenwood et al. Open-domain question answering, 2005.

[35] Fabio Petroni et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, 2020, NeurIPS.

[36] Edouard Grave et al. Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering, 2020, EACL.

[37] Jason Weston et al. Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring, 2020, ICLR.

[38] Ramesh Nallapati et al. Multi-passage BERT: A Globally Normalized BERT Model for Open-domain Question Answering, 2019, EMNLP.

[39] Andrew Chou et al. Semantic Parsing on Freebase from Question-Answer Pairs, 2013, EMNLP.