Towards versatile conversations with data-driven dialog management and its integration in commercial platforms

Abstract: Conversational interfaces have recently become ubiquitous both in the personal sphere, where they ease access to services, and in industrial environments, where they automate services, improve customer support, and yield the corresponding cost savings. However, designing the dialog model that these interfaces use to decide system responses remains a difficult task for complex conversational interactions. This paper describes a data-driven dialog management technique that provides flexibility in developing, deploying, and maintaining this module. Various configurations of classification algorithms are assessed with two dialog corpora that differ in application domain, size, dimensionality, and set of possible system responses. The evaluation shows satisfactory accuracy and coherence rates in both tasks. As a proof of concept, our proposal has also been integrated with DialogFlow, a platform provided by Google for designing conversational user interfaces. The proposal has been assessed with a real use case, showing that it can be deployed in conjunction with commercial platforms and obtains satisfactory results in both the objective and subjective assessments.
