Knowledge-Grounded Pre-Trained Model for Dialogue Response Generation

Teaching machines to answer arbitrary questions is a long-term goal of natural language processing. In real dialogue corpora, informative words such as named entities are often infrequent and hard to model, and a primary challenge for dialogue systems is improving the model's ability to generate high-quality responses containing those informative words. To address this problem, we propose a novel pre-training-based encoder-decoder model that enhances multi-turn dialogue response generation by incorporating external textual knowledge. We adopt BERT as the encoder to merge external knowledge into dialogue-history modeling, and we design a multi-head-attention-based decoder that incorporates semantic information from both the knowledge and dialogue hidden representations into the decoding process, so as to generate informative and appropriate responses. Experiments on two response generation tasks show that our model outperforms competitive baselines in both automatic and human evaluations.
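The core decoding idea described above can be illustrated with a minimal sketch: a decoder state attends separately over dialogue-history encodings and external-knowledge encodings, and the two context vectors are fused. This is a simplified, hypothetical illustration assuming single-head scaled dot-product attention and simple additive fusion; the function names (`attend`, `fuse_knowledge_and_dialogue`) and the use of random vectors in place of BERT outputs are my own assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, keys, values):
    # Scaled dot-product attention for a single query vector (d,)
    # over a sequence of keys/values (n, d); returns a context vector (d,).
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)   # (n,) similarity scores
    weights = softmax(scores)            # (n,) attention distribution
    return weights @ values              # (d,) weighted sum of values

def fuse_knowledge_and_dialogue(dec_state, dialogue_h, knowledge_h):
    # Hypothetical fusion step: the decoder state attends to the
    # dialogue-history representations and the knowledge representations
    # separately, then sums the two context vectors.
    ctx_dialogue = attend(dec_state, dialogue_h, dialogue_h)
    ctx_knowledge = attend(dec_state, knowledge_h, knowledge_h)
    return ctx_dialogue + ctx_knowledge

rng = np.random.default_rng(0)
d = 8
dec_state = rng.standard_normal(d)
dialogue_h = rng.standard_normal((5, d))   # stand-in for BERT-encoded dialogue history
knowledge_h = rng.standard_normal((3, d))  # stand-in for encoded external knowledge
ctx = fuse_knowledge_and_dialogue(dec_state, dialogue_h, knowledge_h)
print(ctx.shape)
```

In a full model, the fused context would feed the decoder's output projection at each step; a multi-head variant would run several `attend` heads in parallel with learned projections and concatenate their outputs.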
