Zero-shot Slot Filling with DPR and RAG

The ability to automatically extract Knowledge Graphs (KGs) from a collection of documents is a long-standing problem in Artificial Intelligence. One way to assess this capability is through the task of slot filling: given a query of the form [ENTITY, SLOT, ?], a system is asked to 'fill' the slot by generating or extracting the missing value from one or more relevant passages. This capability is crucial for building automatic knowledge base population systems, which are in ever-increasing demand, especially in enterprise applications. Recently, a promising research direction has been to evaluate language models the same way we would evaluate knowledge bases, and slot filling is the task most suited to this purpose. Recent advances in the field attempt to solve the task in an end-to-end fashion using retrieval-based language models. Models like Retrieval Augmented Generation (RAG) show surprisingly good performance without requiring complex information extraction pipelines. However, the results these models achieve on the two slot filling tasks in the KILT benchmark are still below the level required by real-world information extraction systems. In this paper, we describe several strategies we adopted to improve both the retriever and the generator of RAG in order to make it a better slot filler. Our KGI0 system reached the top-1 position on the KILT leaderboard on both the T-REx and zsRE datasets by a large margin.
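As a concrete illustration of the retrieve-then-generate setup described above, the sketch below runs a slot-filling-style query through a publicly available RAG checkpoint (a DPR question encoder paired with a BART generator) from the Hugging Face transformers library. It is a minimal sketch only: the checkpoint name, the "[SEP]"-delimited query rendering, and the dummy retrieval index are illustrative assumptions and do not reflect the exact KGI0 configuration.

```python
# Minimal sketch of retrieve-then-generate slot filling with an
# off-the-shelf RAG checkpoint (DPR retriever + BART generator).
# The query format "ENTITY [SEP] SLOT" and the dummy index are
# illustrative assumptions, not the KGI0 setup from the paper.
from transformers import RagRetriever, RagSequenceForGeneration, RagTokenizer

model_name = "facebook/rag-sequence-nq"  # public checkpoint, assumed for illustration
tokenizer = RagTokenizer.from_pretrained(model_name)

# use_dummy_dataset avoids downloading the full Wikipedia DPR index for this demo
retriever = RagRetriever.from_pretrained(
    model_name, index_name="exact", use_dummy_dataset=True
)
model = RagSequenceForGeneration.from_pretrained(model_name, retriever=retriever)

# Slot filling query [ENTITY, SLOT, ?] rendered as a single text sequence
query = "Albert Einstein [SEP] educated at"
inputs = tokenizer(query, return_tensors="pt")

# DPR encodes the query, retrieves passages, and BART generates the slot filler
generated = model.generate(input_ids=inputs["input_ids"], num_beams=4)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

In the approach described in the paper, both the DPR retriever and the BART generator are further adapted to the slot filling task rather than used off the shelf as above.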
