MTSS: Learn from Multiple Domain Teachers and Become a Multi-domain Dialogue Expert

Building a high-quality multi-domain dialogue system is challenging because the dialogue state space is complicated and entangled across domains, which severely limits the quality of the dialogue policy and, in turn, the generated responses. In this paper, we propose a novel method that acquires a satisfying policy while subtly circumventing the knotty problem of dialogue state representation in the multi-domain setting. Inspired by real school teaching scenarios, our method comprises multiple domain-specific teachers and a universal student. Each teacher focuses on a single domain and learns the corresponding domain knowledge and dialogue policy from a precisely extracted single-domain dialogue state representation. These domain-specific teachers then impart their knowledge and policies to a universal student model, collectively turning the student into a multi-domain dialogue expert. Experimental results show that our method is competitive with state-of-the-art systems in both the multi-domain and single-domain settings.
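The abstract does not spell out how the teachers transfer their policies to the student. As a hypothetical illustration only (the function names, the domain-routing scheme, and the temperature value are assumptions, not the authors' formulation), multi-teacher knowledge distillation of this kind is commonly implemented as a temperature-softened KL divergence between the student's action distribution and that of the active domain's teacher:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def multi_teacher_distill_loss(student_logits, teacher_logits_by_domain,
                               domain, temperature=2.0):
    """Sketch of per-domain distillation: the universal student mimics
    the softened output of the teacher for the dialogue's active domain.
    (Assumed interface; the paper's actual objective may differ.)"""
    teacher_probs = softmax(teacher_logits_by_domain[domain], temperature)
    student_probs = softmax(student_logits, temperature)
    return kl_divergence(teacher_probs, student_probs)
```

Under this sketch, each training dialogue is routed to its domain's teacher, and the summed loss over domains would push the single student toward expertise in all of them.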
