A Unified Encoder-Decoder Framework with Entity Memory

Entities, as important carriers of real-world knowledge, play a key role in many NLP tasks. We focus on incorporating entity knowledge into an encoder-decoder framework for informative text generation. Existing approaches try to index, retrieve, and read external documents as evidence, but they suffer from a large computational overhead. In this work, we propose an encoder-decoder framework with an entity memory, namely EDMem. The entity knowledge is stored in the memory as latent representations, and the memory is pre-trained on Wikipedia along with the encoder-decoder parameters. To generate entity names precisely, we design three decoding methods that constrain entity generation by linking to entities in the memory. EDMem is a unified framework that can be applied to a variety of entity-intensive question answering and generation tasks. Extensive experimental results show that EDMem outperforms both memory-based auto-encoder models and non-memory encoder-decoder models.
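To make the idea of an entity memory concrete, the sketch below shows one plausible way such a layer could work: entity knowledge is stored as latent embeddings, encoder hidden states attend over the memory to retrieve and fuse entity representations, and the attention scores yield an entity link that can later constrain decoding. This is a minimal illustration under our own assumptions (class names, dimensions, and the fusion scheme are hypothetical), not the authors' EDMem implementation.

```python
# Minimal PyTorch sketch of an entity-memory layer (illustrative, not EDMem's actual code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class EntityMemory(nn.Module):
    """Stores entity knowledge as latent embeddings and lets hidden states attend to it."""
    def __init__(self, num_entities: int, hidden_size: int, entity_dim: int = 256):
        super().__init__()
        self.entity_embeddings = nn.Embedding(num_entities, entity_dim)  # latent entity memory
        self.query_proj = nn.Linear(hidden_size, entity_dim)   # hidden state -> memory query
        self.value_proj = nn.Linear(entity_dim, hidden_size)   # retrieved entity -> hidden space

    def forward(self, hidden_states: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden_size), e.g. mention positions from the encoder
        queries = self.query_proj(hidden_states)                    # (B, L, entity_dim)
        scores = queries @ self.entity_embeddings.weight.T          # (B, L, num_entities)
        attn = F.softmax(scores, dim=-1)
        retrieved = attn @ self.entity_embeddings.weight            # (B, L, entity_dim)
        # Fuse the retrieved entity knowledge back into the hidden states (residual add).
        fused = hidden_states + self.value_proj(retrieved)
        # The highest-scoring entity gives a link that could constrain entity-name generation.
        linked_entity_ids = scores.argmax(dim=-1)                   # (B, L)
        return fused, linked_entity_ids

# Toy usage: 1,000 entities, hidden size 768, a batch of 2 sequences of length 5.
memory = EntityMemory(num_entities=1000, hidden_size=768)
h = torch.randn(2, 5, 768)
fused, links = memory(h)
print(fused.shape, links.shape)  # torch.Size([2, 5, 768]) torch.Size([2, 5])
```

During generation, a constrained decoding strategy such as the ones the abstract mentions could, for example, copy the surface form of the linked entity once a mention boundary is predicted; that detail is beyond this sketch.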
