DLGNet-Task: An End-to-end Neural Network Framework for Modeling Multi-turn Multi-domain Task-Oriented Dialogue

Task-oriented dialogue (TOD) requires the complex interleaving of a number of individually controllable components with strong guarantees of explainability and verifiability. This has made it difficult to adopt the multi-turn, multi-domain dialogue generation capabilities of streamlined end-to-end open-domain dialogue systems. In this paper, we present DLGNet-Task, a unified task-oriented dialogue framework that employs autoregressive transformer networks such as DLGNet and GPT-2/3 to complete user tasks in multi-turn, multi-domain conversations. Our framework retains the controllable, verifiable, and explainable outputs of modular approaches while keeping the low development, deployment, and maintenance cost of end-to-end systems. Treating open-domain system components as additional TOD modules allows DLGNet-Task to learn the joint distribution of the inputs and outputs of all the functional blocks of existing modular approaches, namely natural language understanding (NLU), state tracking, action policy, and natural language generation (NLG). Rather than training these modules individually, as is common in real-world systems, we train them jointly with appropriate module separations. When evaluated on the MultiWOZ2.1 dataset, DLGNet-Task shows performance comparable to existing state-of-the-art approaches. Furthermore, using DLGNet-Task in conversational AI systems reduces the level of effort required to develop, deploy, and maintain intelligent assistants at scale.
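As a rough illustration of the single-sequence formulation described above, the sketch below shows how one dialogue turn could be flattened into a single token stream in which the outputs of each module (belief state, database result, system action, response) are demarcated by module-separation tokens, so that one autoregressive transformer models their joint distribution. The separator token names and field layout are assumptions made for illustration, not the paper's exact serialization schema.

```python
# Minimal sketch (assumed format): serialize a TOD turn into one sequence
# with module-separation tokens for autoregressive training.
from dataclasses import dataclass

# Hypothetical separator tokens; the paper's actual tokens may differ.
USER, BELIEF, DB, ACT, RESPONSE, EOS = (
    "<|user|>", "<|belief|>", "<|db|>", "<|action|>", "<|response|>", "<|endofturn|>"
)

@dataclass
class Turn:
    user_utterance: str
    belief_state: str      # e.g. "restaurant { food = italian ; area = centre }"
    db_result: str         # e.g. "restaurant 4 matches"
    system_action: str     # e.g. "inform ( name ) request ( people )"
    response: str          # delexicalized system response

def serialize(history: str, turn: Turn) -> str:
    """Flatten one turn (plus prior context) into a single training sequence."""
    return " ".join([
        history,
        USER, turn.user_utterance,
        BELIEF, turn.belief_state,
        DB, turn.db_result,
        ACT, turn.system_action,
        RESPONSE, turn.response,
        EOS,
    ]).strip()

if __name__ == "__main__":
    example = Turn(
        user_utterance="i need an italian restaurant in the centre",
        belief_state="restaurant { food = italian ; area = centre }",
        db_result="restaurant 4 matches",
        system_action="inform ( name ) request ( people )",
        response="[restaurant_name] is a nice option . how many people ?",
    )
    print(serialize("", example))
```

At inference time, under this kind of format, the model would be prompted with the history and user utterance up to the belief-state separator and then decode the remaining module outputs left to right, which is what makes each intermediate output inspectable and verifiable.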
