Retrieval-based Goal-Oriented Dialogue Generation

Most research on dialogue has focused either on dialogue generation for open-ended chit-chat or on state tracking for goal-directed dialogue. In this work, we explore a hybrid approach to goal-oriented dialogue generation that combines retrieval from past history with a hierarchical neural encoder-decoder architecture. We evaluate this approach in the customer support domain using the MultiWOZ dataset (Budzianowski et al., 2018). We show that adding this retrieval step to a hierarchical neural encoder-decoder architecture leads to significant improvements, including responses that human evaluators rate as more appropriate and fluent. Finally, we compare our retrieval-based model to several semantically conditioned models that explicitly use past dialogue act information, and find that our proposed model is competitive with the current state of the art (Chen et al., 2019) while not requiring explicit labels for past machine acts.
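The retrieval step described above can be illustrated with a minimal sketch. The abstract does not specify how past history is indexed or scored, so everything below is an assumption made for illustration: the bag-of-words cosine similarity, the function names (`retrieve_response`, `build_decoder_input`), and the way the retrieved response is handed to the decoder are all hypothetical and stand in for whatever learned representation and conditioning the model actually uses.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words counts (a stand-in for a learned context encoding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_response(context, history):
    """Return the response paired with the most similar past context.

    `history` is a list of (past_context, past_response) pairs.
    Returns None when the history is empty.
    """
    if not history:
        return None
    query = bow(context)
    _, best_response = max(history, key=lambda pair: cosine(query, bow(pair[0])))
    return best_response


def build_decoder_input(context, history):
    """Bundle the current context with the retrieved response.

    Hypothetical interface: a generator would condition on both fields.
    """
    return {"context": context, "retrieved": retrieve_response(context, history)}


history = [
    ("i need a cheap hotel in the north", "what price range do you have in mind ?"),
    ("book a train to cambridge", "when would you like to leave ?"),
]
example = build_decoder_input("i am looking for a train to cambridge", history)
```

In this toy setup the second history entry shares more words with the query, so its response is the one retrieved. A real system would likely replace the exhaustive scan with an approximate nearest-neighbor index over encoder states.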

[1] Jianfeng Gao, et al. End-to-End Task-Completion Neural Dialogue Systems, 2017, IJCNLP.

[2] David Vandyke, et al. Multi-domain Dialog State Tracking using Recurrent Neural Networks, 2015, ACL.

[3] Jiliang Tang, et al. A Survey on Dialogue Systems: Recent Advances and New Frontiers, 2017, SIGKDD Explorations.

[4] Maxine Eskénazi, et al. Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models, 2019, NAACL.

[5] Jianfeng Gao, et al. deltaBLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets, 2015, ACL.

[6] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[7] Matthew Henderson, et al. Machine Learning for Dialog State Tracking: A Review, 2015.

[8] David Vandyke, et al. Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems, 2015, EMNLP.

[9] Michael White, et al. EXEMPLARS: A Practical, Extensible Framework For Dynamic Text Generation, 1998, INLG.

[10] Maxine Eskénazi, et al. Structured Fusion Networks for Dialog, 2019, SIGDIAL.

[11] Wenhu Chen, et al. Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention, 2019, ACL.

[12] Daniel Jurafsky, et al. A Hierarchical Neural Autoencoder for Paragraphs and Documents, 2015, ACL.

[13] Graham Neubig, et al. Dialogue State Tracking using Long Short Term Memory Neural Networks, 2015.

[14] Joelle Pineau, et al. Hierarchical Neural Network Generative Models for Movie Dialogues, 2015, arXiv.

[15] Joelle Pineau, et al. The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems, 2015, SIGDIAL.

[16] Sandeep Subramanian, et al. On Extractive and Abstractive Neural Document Summarization with Transformer Language Models, 2020, EMNLP.

[17] Alan Ritter, et al. Data-Driven Response Generation in Social Media, 2011, EMNLP.

[18] Ryuichiro Higashinaka, et al. Open-domain Utterance Generation for Conversational Dialogue Systems using Web-scale Dependency Structures, 2013, SIGDIAL.

[19] Alan Ritter, et al. Adversarial Learning for Neural Dialogue Generation, 2017, EMNLP.

[20] Masahiro Shibata, et al. Dialog System for Open-Ended Conversation Using Web Documents, 2009, Informatica.

[21] Maxine Eskénazi, et al. Towards End-to-End Learning for Dialog State Tracking and Management using Deep Reinforcement Learning, 2016, SIGDIAL.

[22] Piotr Indyk, et al. Approximate nearest neighbors: towards removing the curse of dimensionality, 1998, STOC.

[23] Joelle Pineau, et al. Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models, 2015, AAAI.

[24] Joachim Bingel, et al. Domain Transfer in Dialogue Systems without Turn-Level Supervision, 2019, arXiv.

[25] Isabelle Augenstein, et al. A strong baseline for question relevancy ranking, 2018, EMNLP.

[26] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.

[27] Yonghong Yan, et al. Dialog State Tracking using Conditional Random Fields, 2013, SIGDIAL.

[28] Matthew Henderson, et al. Deep Neural Network Approach for the Dialog State Tracking Challenge, 2013, SIGDIAL.

[29] Lu Chen, et al. Hybrid Dialogue State Tracking for Real World Human-to-Human Dialogues, 2016, INTERSPEECH.

[30] Jason Weston, et al. Retrieve and Refine: Improved Sequence Generation Models For Dialogue, 2018, SCAI@EMNLP.

[31] Libo Qin, et al. Sequence-to-Sequence Learning for Task-oriented Dialogue with Dialogue State Representation, 2018, COLING.

[32] Yang Zhao, et al. A Conditional Variational Framework for Dialog Generation, 2017, ACL.

[33] Matthew Henderson, et al. Word-Based Dialog State Tracking with Recurrent Neural Networks, 2014, SIGDIAL.

[34] Tsung-Hsien Wen, et al. Neural Belief Tracker: Data-Driven Dialogue State Tracking, 2016, ACL.

[35] Lu Chen, et al. A generalized rule based tracker for dialogue state tracking, 2014, IEEE Spoken Language Technology Workshop (SLT).

[36] Dilek Z. Hakkani-Tür, et al. Scalable multi-domain dialogue state tracking, 2017, IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).

[37] Quoc V. Le, et al. A Neural Conversational Model, 2015, arXiv.

[38] Jun Xu, et al. Reinforcing Coherence for Sequence to Sequence Model in Dialogue Generation, 2018, IJCAI.

[39] Arpit Gupta, et al. Scaling Multi-Domain Dialogue State Tracking via Query Reformulation, 2019, NAACL.

[40] Joelle Pineau, et al. How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation, 2016, EMNLP.

[41] Joelle Pineau, et al. A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues, 2016, AAAI.

[42] Jörg Tiedemann, et al. OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles, 2016, LREC.

[43] Jakob Grue Simonsen, et al. A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion, 2015, CIKM.

[44] Xiang Li, et al. Two are Better than One: An Ensemble of Retrieval- and Generation-Based Dialog Systems, 2016, arXiv.

[45] Stefan Ultes, et al. MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling, 2018, EMNLP.

[46] Lu Chen, et al. Towards Universal Dialogue State Tracking, 2018, EMNLP.

[47] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[48] Mari Ostendorf, et al. LSTM based Conversation Models, 2016, arXiv.

[49] Hang Li, et al. Neural Responding Machine for Short-Text Conversation, 2015, ACL.

[50] Joelle Pineau, et al. Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses, 2017, ACL.

[51] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.