Context-aware Memory Enhanced Transformer for End-to-end Task-Oriented Dialogue Systems

Recent studies have built task-oriented dialogue systems in an end-to-end manner, and existing works have made great progress on this task. However, one issue still needs further consideration: how to effectively represent knowledge bases and incorporate them into dialogue systems. To address this issue, we design a novel Context-aware Memory Generation module to model knowledge bases, which generates context-aware entity representations by perceiving relevant entities. Furthermore, we incorporate this module into the Transformer and propose the Context-aware Memory Enhanced Transformer (CMET), which aggregates information from the dialogue history and knowledge bases to generate better responses. Extensive experiments show that our method achieves superior performance over state-of-the-art methods.
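The core idea, entities that "perceive" each other before the decoder attends over both the dialogue history and the knowledge-base memory, can be sketched with plain attention arithmetic. The sketch below is a minimal numpy illustration under assumed dimensions and function names; it is not the authors' CMET implementation, only an attention-based analogue of the two steps the abstract describes.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def context_aware_entity_memory(entity_emb, history_emb):
    """Illustrative 'context-aware memory generation': each KB entity
    attends to the other entities (so its representation perceives
    relevant entities), then to the dialogue history."""
    d = entity_emb.shape[-1]
    # Self-attention among KB entities.
    scores = entity_emb @ entity_emb.T / np.sqrt(d)
    entities = softmax(scores) @ entity_emb
    # Cross-attention from entities to the dialogue history (residual add).
    scores = entities @ history_emb.T / np.sqrt(d)
    return entities + softmax(scores) @ history_emb

def decode_step(query, memory, history_emb):
    """One decoder step that aggregates information from both the
    dialogue history and the KB memory via a single attention pass."""
    mixed = np.concatenate([history_emb, memory], axis=0)
    scores = query @ mixed.T / np.sqrt(query.shape[-1])
    return softmax(scores) @ mixed

rng = np.random.default_rng(0)
d = 16
entities = rng.normal(size=(5, d))   # 5 KB entity embeddings (assumed)
history = rng.normal(size=(8, d))    # 8 dialogue-history token embeddings
memory = context_aware_entity_memory(entities, history)
out = decode_step(rng.normal(size=(d,)), memory, history)
print(memory.shape, out.shape)  # (5, 16) (16,)
```

A real model would add learned projection matrices, multi-head attention, and layer normalization around each step; the sketch keeps only the information flow the abstract names.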
