Knowledge-Grounded Dialogue Generation with Pre-trained Language Models

We study knowledge-grounded dialogue generation with pre-trained language models. To leverage external knowledge that is often redundant and exceeds the capacity constraint of such models, we propose equipping response generation, defined by a pre-trained language model, with a knowledge selection module, together with an unsupervised approach that jointly optimizes knowledge selection and response generation from unlabeled dialogues. Empirical results on two benchmarks indicate that our model significantly outperforms state-of-the-art methods in both automatic evaluation and human judgment.
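The abstract describes a two-stage pipeline: score candidate knowledge sentences against the dialogue context, select one, and condition a pre-trained generator on the selected knowledge. The sketch below illustrates that flow only; the concrete choices (BERT with cosine similarity for scoring, greedy top-1 selection, BART as the generator, and the input concatenation format) are illustrative assumptions, not the paper's actual architecture or its unsupervised joint training procedure.

```python
# Minimal sketch of knowledge selection + knowledge-grounded generation with
# pre-trained language models. Model choices and the selection rule are
# assumptions for illustration, not the authors' implementation.
import torch
from transformers import (
    AutoModel,
    AutoTokenizer,
    BartForConditionalGeneration,
    BartTokenizer,
)

# --- Knowledge selection: embed context and candidates, pick the closest ---
enc_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    batch = enc_tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state        # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)           # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)            # mean pooling -> (B, H)

def select_knowledge(context, candidates):
    ctx = embed([context])                                  # (1, H)
    cand = embed(candidates)                                # (N, H)
    scores = torch.nn.functional.cosine_similarity(ctx, cand)
    return candidates[int(scores.argmax())]

# --- Response generation: condition a pre-trained LM on selected knowledge ---
gen_tok = BartTokenizer.from_pretrained("facebook/bart-base")
generator = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def generate_response(context, knowledge):
    # Concatenate knowledge and dialogue context as the generator's input.
    src = knowledge + " </s> " + context
    inputs = gen_tok(src, return_tensors="pt", truncation=True, max_length=512)
    out = generator.generate(**inputs, num_beams=4, max_length=64)
    return gen_tok.decode(out[0], skip_special_tokens=True)

context = "I just watched a documentary about coral reefs."
knowledge_pool = [
    "Coral reefs are underwater ecosystems built from calcium carbonate.",
    "The Great Wall of China is thousands of kilometers long.",
]
picked = select_knowledge(context, knowledge_pool)
print(generate_response(context, picked))
```

In the paper's setting the selector is learned jointly with the generator from unlabeled dialogues rather than fixed to a similarity heuristic; the sketch only shows where such a module sits relative to the pre-trained generator.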
