Structured Fusion Networks for Dialog

Neural dialog models have exhibited strong performance; however, their end-to-end nature lacks a representation of the explicit structure of dialog. This results in a loss of generalizability, controllability, and data efficiency. Conversely, more traditional dialog systems do employ strong models of explicit structure. This paper introduces several approaches for explicitly incorporating structure into neural models of dialog. Structured Fusion Networks first learn neural dialog modules corresponding to the structured components of traditional dialog systems and then incorporate these modules into a higher-level generative model. Structured Fusion Networks obtain strong results on the MultiWOZ dataset, both with and without reinforcement learning, and are shown to have several valuable properties: better domain generalizability, improved performance in reduced-data scenarios, and robustness to divergence during reinforcement learning.
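The core idea — freeze pre-trained structured modules (NLU, dialog manager, NLG) and let a higher-level generative model condition on their outputs — can be sketched in a few lines. The following is a minimal illustrative toy, not the paper's architecture: the module parameterizations, dimensions, and the concatenation-based fusion rule are all assumptions made for clarity.

```python
# Toy sketch of a Structured Fusion Network: frozen NLU/DM/NLG modules
# feed a higher-level model that produces the response distribution.
# All names, shapes, and the fusion rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, BELIEF, ACT, HID = 50, 8, 6, 16

# "Pre-trained" module weights (held fixed during fusion training).
W_nlu = rng.standard_normal((VOCAB, BELIEF))   # utterance -> belief state
W_dm  = rng.standard_normal((BELIEF, ACT))     # belief state -> dialog act
W_nlg = rng.standard_normal((ACT, VOCAB))      # dialog act -> word logits

def run_modules(bow_utterance):
    """Run the frozen structured modules on a bag-of-words utterance."""
    belief = np.tanh(bow_utterance @ W_nlu)
    act = np.tanh(belief @ W_dm)
    module_logits = act @ W_nlg
    return belief, act, module_logits

# Higher-level generative model: fuses its own context representation
# with the intermediate outputs of every structured module.
W_ctx  = rng.standard_normal((VOCAB, HID))
W_fuse = rng.standard_normal((HID + BELIEF + ACT + VOCAB, VOCAB))

def structured_fusion(bow_utterance):
    belief, act, module_logits = run_modules(bow_utterance)
    hidden = np.tanh(bow_utterance @ W_ctx)
    fused = np.concatenate([hidden, belief, act, module_logits])
    logits = fused @ W_fuse
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()   # distribution over the next response token

probs = structured_fusion(rng.random(VOCAB))
print(probs.shape)
```

Because the module outputs (belief state, dialog act) remain explicit inputs to the fused model, they stay inspectable and controllable, which is the property the abstract attributes to traditional structured systems.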
