Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in Dialogue Generation

Although pre-trained models have achieved great success in dialogue generation, their performance drops dramatically when the input contains an entity that does not appear in the pre-training or fine-tuning data (an unseen entity). To address this issue, existing methods leverage an external knowledge base to generate appropriate responses. In real-world scenarios, however, the entity may not be covered by the knowledge base, or the response may suffer from imprecise knowledge retrieval. To deal with this problem, instead of feeding the knowledge base into the model as input, we force the model to learn a better semantic representation by predicting the information in the knowledge base from the input context alone. Specifically, with the help of a knowledge base, we introduce two auxiliary training objectives: 1) Interpret Masked Word, which conjectures the meaning of a masked entity given the context; and 2) Hypernym Generation, which predicts the hypernym of the entity based on the context. Experimental results on two dialogue corpora verify the effectiveness of our method under both knowledge-available and knowledge-unavailable settings.
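The abstract describes the two auxiliary objectives only at a high level; below is a minimal, illustrative sketch of how such multi-task fine-tuning could be wired up. It assumes a BART-style seq2seq backbone from Hugging Face Transformers and casts both auxiliary objectives as extra sequence-to-sequence tasks whose losses are summed with the response-generation loss; the example strings, the gloss/hypernym targets, and the weights lambda_imw / lambda_hg are hypothetical and not taken from the paper.

```python
# Minimal sketch (not the authors' implementation) of knowledge-enhanced
# fine-tuning with two auxiliary objectives: Interpret Masked Word and
# Hypernym Generation. Assumes a BART-style encoder-decoder.
import torch
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def seq2seq_loss(source: str, target: str) -> torch.Tensor:
    """Standard conditional-generation loss for one (source, target) pair."""
    enc = tokenizer(source, return_tensors="pt", truncation=True)
    labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids
    return model(**enc, labels=labels).loss

# One toy training example. The entity is masked in the auxiliary input,
# and its gloss / hypernym targets would come from a knowledge base.
context = "Have you seen the new film by Hayao Miyazaki?"
response = "Yes, I loved it. His animation style is unique."
masked_context = "Have you seen the new film by <mask>?"
entity_gloss = "Japanese animator and film director"   # Interpret Masked Word target
entity_hypernym = "film director"                      # Hypernym Generation target

lambda_imw, lambda_hg = 0.5, 0.5  # hypothetical loss weights

loss = (
    seq2seq_loss(context, response)                         # main response generation
    + lambda_imw * seq2seq_loss(masked_context, entity_gloss)
    + lambda_hg * seq2seq_loss(masked_context, entity_hypernym)
)
loss.backward()  # then step an optimizer as usual
```

Note that the knowledge-derived targets are consumed only as training signals; at inference time the model conditions on the dialogue context alone, which is what makes the approach applicable when the knowledge base is unavailable or retrieval is unreliable.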
