A Unified Encoder-Decoder Framework with Entity Memory

Entities, as important carriers of real-world knowledge, play a key role in many NLP tasks. We focus on incorporating entity knowledge into an encoder-decoder framework for informative text generation. Existing approaches try to index, retrieve, and read external documents as evidence, but they suffer from a large computational overhead. In this work, we propose an encoder-decoder framework with an entity memory, namely EDMem. The entity knowledge is stored in the memory as latent representations, and the memory is pre-trained on Wikipedia along with the encoder-decoder parameters. To generate entity names precisely, we design three decoding methods that constrain entity generation by linking to entities in the memory. EDMem is a unified framework that can be applied to a variety of entity-intensive question answering and generation tasks. Extensive experimental results show that EDMem outperforms both memory-based auto-encoder models and non-memory encoder-decoder models.
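To make the idea of an entity memory concrete, the sketch below shows one plausible way such a layer could work: entity knowledge is stored as latent embeddings, encoder hidden states attend over the memory to retrieve and fuse entity representations, and the attention scores yield an entity link that can later constrain decoding. This is a minimal illustration under our own assumptions (class names, dimensions, and the fusion scheme are hypothetical), not the authors' EDMem implementation.

```python
# Minimal PyTorch sketch of an entity-memory layer (illustrative, not EDMem's actual code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class EntityMemory(nn.Module):
    """Stores entity knowledge as latent embeddings and lets hidden states attend to it."""
    def __init__(self, num_entities: int, hidden_size: int, entity_dim: int = 256):
        super().__init__()
        self.entity_embeddings = nn.Embedding(num_entities, entity_dim)  # latent entity memory
        self.query_proj = nn.Linear(hidden_size, entity_dim)   # hidden state -> memory query
        self.value_proj = nn.Linear(entity_dim, hidden_size)   # retrieved entity -> hidden space

    def forward(self, hidden_states: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden_size), e.g. mention positions from the encoder
        queries = self.query_proj(hidden_states)                    # (B, L, entity_dim)
        scores = queries @ self.entity_embeddings.weight.T          # (B, L, num_entities)
        attn = F.softmax(scores, dim=-1)
        retrieved = attn @ self.entity_embeddings.weight            # (B, L, entity_dim)
        # Fuse the retrieved entity knowledge back into the hidden states (residual add).
        fused = hidden_states + self.value_proj(retrieved)
        # The highest-scoring entity gives a link that could constrain entity-name generation.
        linked_entity_ids = scores.argmax(dim=-1)                   # (B, L)
        return fused, linked_entity_ids

# Toy usage: 1,000 entities, hidden size 768, a batch of 2 sequences of length 5.
memory = EntityMemory(num_entities=1000, hidden_size=768)
h = torch.randn(2, 5, 768)
fused, links = memory(h)
print(fused.shape, links.shape)  # torch.Size([2, 5, 768]) torch.Size([2, 5])
```

During generation, a constrained decoding strategy such as the ones the abstract mentions could, for example, copy the surface form of the linked entity once a mention boundary is predicted; that detail is beyond this sketch.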
