DukeNet: A Dual Knowledge Interaction Network for Knowledge-Grounded Conversation

Today's conversational agents often generate responses that are not sufficiently informative. One way of making them more informative is through the use of external knowledge sources, as in so-called Knowledge-Grounded Conversations (KGCs). In this paper, we target the Knowledge Selection (KS) task, a key ingredient of KGC, which aims to select the appropriate knowledge to be used in the next response. Existing approaches to KS are based on learned representations of the conversation context, that is, the previous conversation turns, and use Maximum Likelihood Estimation (MLE) to optimize KS. Such approaches have two main limitations. First, they do not explicitly track what knowledge has been used in the conversation or how topics have shifted during the conversation. Second, MLE often relies on a limited set of example conversations for training, from which it is hard to infer that facts retrieved from the knowledge source can be re-used in multiple conversation contexts, and vice versa. We propose the Dual Knowledge Interaction Network (DukeNet), a framework to address these challenges. DukeNet explicitly models knowledge tracking and knowledge shifting as dual tasks. We also design Dual Knowledge Interaction Learning (DukeL), an unsupervised learning scheme that trains DukeNet by facilitating interactions between knowledge tracking and knowledge shifting, which, in turn, enables DukeNet to explore knowledge beyond that encountered in the training set. This dual process also allows us to define rewards that help optimize both knowledge tracking and knowledge shifting. Experimental results on two public KGC benchmarks show that DukeNet significantly outperforms state-of-the-art methods in both automatic and human evaluations, indicating that DukeNet, enhanced by DukeL, can select more appropriate knowledge and hence generate more informative and engaging responses.
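To make the dual-task idea concrete, the following is a minimal, hypothetical sketch (in PyTorch) of how a knowledge tracking scorer and a knowledge shifting scorer could reward each other in an unsupervised dual step, in the spirit of DukeL as described above. The KnowledgeScorer module, the bilinear scoring function, the probability-based rewards, and all dimensions are illustrative assumptions, not the authors' exact formulation, which the abstract does not specify.

```python
# Hypothetical sketch of dual knowledge interaction learning: two scorers
# (knowledge tracking over the previous turn, knowledge shifting for the next
# response) are trained without selection labels by rewarding each other's
# sampled choices via REINFORCE-style gradients. All names and the reward
# definition are illustrative assumptions, not the paper's exact method.
import torch
import torch.nn as nn


class KnowledgeScorer(nn.Module):
    """Scores each candidate knowledge vector against a context vector."""

    def __init__(self, dim: int):
        super().__init__()
        self.bilinear = nn.Bilinear(dim, dim, 1)

    def forward(self, context: torch.Tensor, candidates: torch.Tensor) -> torch.Tensor:
        # context: (batch, dim); candidates: (batch, n_cand, dim)
        ctx = context.unsqueeze(1).expand_as(candidates).contiguous()
        logits = self.bilinear(ctx, candidates.contiguous()).squeeze(-1)
        return logits  # (batch, n_cand)


def dual_interaction_step(tracker, shifter, ctx_prev, ctx_next, candidates, optimizer):
    """One unsupervised dual step: the shifter samples a knowledge selection and
    is rewarded by how likely the tracker finds that selection (and vice versa).
    This symmetric reward is a simplifying assumption for illustration."""
    track_logits = tracker(ctx_prev, candidates)
    shift_logits = shifter(ctx_next, candidates)

    track_dist = torch.distributions.Categorical(logits=track_logits)
    shift_dist = torch.distributions.Categorical(logits=shift_logits)

    track_sample = track_dist.sample()
    shift_sample = shift_dist.sample()

    # Each task's reward is the dual task's (detached) probability of its choice.
    reward_for_shift = track_dist.log_prob(shift_sample).detach().exp()
    reward_for_track = shift_dist.log_prob(track_sample).detach().exp()

    # REINFORCE-style loss: maximize reward-weighted log-probabilities.
    loss = -(reward_for_shift * shift_dist.log_prob(shift_sample)).mean() \
           - (reward_for_track * track_dist.log_prob(track_sample)).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    dim, batch, n_cand = 64, 8, 10
    tracker, shifter = KnowledgeScorer(dim), KnowledgeScorer(dim)
    opt = torch.optim.Adam(list(tracker.parameters()) + list(shifter.parameters()), lr=1e-3)
    ctx_prev, ctx_next = torch.randn(batch, dim), torch.randn(batch, dim)
    candidates = torch.randn(batch, n_cand, dim)
    print(dual_interaction_step(tracker, shifter, ctx_prev, ctx_next, candidates, opt))
```

In this toy setup the two scorers regularize each other: a knowledge sentence that is plausible as "already used" in the previous turn and as "newly selected" for the next turn receives mutually reinforcing signal, which is one way to explore knowledge beyond the labeled training examples.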
