GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection

Pre-trained models have proved to be powerful in enhancing task-oriented dialog systems. However, current pre-training methods mainly focus on enhancing dialog understanding and generation tasks while neglecting the exploitation of dialog policy. In this paper, we propose GALAXY, a novel pre-trained dialog model that explicitly learns dialog policy from limited labeled dialogs and large-scale unlabeled dialog corpora via semi-supervised learning. Specifically, we introduce a dialog act prediction task for policy optimization during pre-training and employ a consistency regularization term to refine the learned representation with the help of unlabeled dialogs. We also implement a gating mechanism to weigh suitable unlabeled dialog samples. Empirical results show that GALAXY substantially improves the performance of task-oriented dialog systems, and achieves new state-of-the-art results on benchmark datasets: In-Car, MultiWOZ2.0 and MultiWOZ2.1, improving their end-to-end combined scores by 2.5, 5.3 and 5.5 points, respectively. We also show that GALAXY has a stronger few-shot ability than existing models under various low-resource settings. For reproducibility, we release the code and data at https://github.com/siat-nlp/GALAXY.
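As a rough illustration of the consistency regularization and gating described above, the sketch below computes a symmetric KL divergence between two dialog act predictions for the same unlabeled context (for example, two forward passes with different dropout masks) and scales it by a confidence gate. This is a minimal sketch under stated assumptions, not the released GALAXY implementation: the function names, the use of a single categorical distribution over dialog acts, and the entropy-based form of the gate are all hypothetical choices made here for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(p, q):
    """Symmetric KL between two predictions for the same dialog context."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def entropy_gate(p, eps=1e-12):
    """Hypothetical gate in [0, 1]: near 1 for confident predictions,
    near 0 when the prediction is close to uniform (assumed form)."""
    entropy = -sum(pi * math.log(pi + eps) for pi in p)
    max_entropy = math.log(len(p))
    return min(1.0, max(0.0, 1.0 - entropy / max_entropy))


def gated_consistency_loss(logits_a, logits_b):
    """Weigh the consistency term of one unlabeled sample by its gate."""
    p, q = softmax(logits_a), softmax(logits_b)
    mean_pred = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return entropy_gate(mean_pred) * consistency_loss(p, q)


# Two noisy predictions over four hypothetical dialog acts.
loss = gated_consistency_loss([2.0, 0.5, 0.1, -1.0], [1.8, 0.7, 0.0, -0.9])
print(loss)
```

In this sketch, an unlabeled sample whose predicted dialog act distribution is nearly uniform contributes almost nothing to the loss, which is one plausible way to read "weighing suitable unlabeled dialog samples".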
