Multi-goal multi-agent learning for task-oriented dialogue with bidirectional teacher-student learning

Abstract In this work, we propose a multi-goal multi-agent learning (MGMA) framework for task-oriented dialogue generation, which aims to retrieve accurate entities from a knowledge base (KB) and generate human-like responses simultaneously. Specifically, MGMA consists of a KB-oriented teacher agent for querying the KB, a context-oriented teacher agent for extracting dialogue patterns, and a student agent that tries not only to retrieve accurate entities from the KB but also to generate human-like responses. A "two-to-one" teacher-student learning method is proposed to coordinate these three networks, training the student network to integrate the expert knowledge of the two teacher networks and achieve comprehensive performance in task-oriented dialogue generation. In addition, we update the two teachers based on the output of the student network, since the space of possible responses is large and the teachers should adapt to the conversation style of the student. In this way, we obtain more empathetic teachers with better performance. Moreover, to model each task-oriented dialogue effectively, we employ a dialogue memory network that dynamically filters out irrelevant dialogue history and memorizes important newly arriving information. Another KB memory network, which shares the structured KB tuples throughout the whole conversation, is adopted to dynamically extract KB information with a memory pointer at each utterance. Extensive experiments on three benchmark datasets (i.e., CamRest, In-Car Assistant, and MultiWOZ 2.1) demonstrate that MGMA significantly outperforms baseline methods in terms of both automatic and human evaluation.
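The "two-to-one" distillation idea above can be illustrated with a minimal, self-contained sketch. This is not the paper's implementation: the blending weight `beta`, the temperature, and the function names are illustrative assumptions; the sketch only shows the generic pattern of distilling a mixture of two teacher distributions into one student via a KL-divergence objective.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def two_teacher_distillation_loss(student_logits, kb_teacher_logits,
                                  ctx_teacher_logits,
                                  temperature=2.0, beta=0.5):
    """Distill a blend of two teachers into the student.

    beta (hypothetical hyperparameter) weights the KB-oriented teacher
    against the context-oriented teacher; the soft target is
    beta * p_kb + (1 - beta) * p_ctx, and the loss is the KL divergence
    from that target to the student's softened distribution.
    """
    p_kb = softmax(kb_teacher_logits, temperature)
    p_ctx = softmax(ctx_teacher_logits, temperature)
    target = [beta * a + (1 - beta) * b for a, b in zip(p_kb, p_ctx)]
    p_student = softmax(student_logits, temperature)
    return kl_divergence(target, p_student)
```

When the student's logits match both teachers', the loss is (numerically) zero; any disagreement yields a positive loss, which gradient descent on the student's parameters would then reduce. The bidirectional update of the teachers described in the abstract would reuse the same loss with the roles of target and student swapped.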
