Policy committee for adaptation in multi-domain spoken dialogue systems

Moving from limited-domain to open-domain dialogue systems raises a number of challenges. One is the ability of the system to use small amounts of data from disparate domains to build a dialogue manager policy. Previous work has focused on using data from different domains to adapt a generic policy to a specific domain. Inspired by Bayesian committee machines, this paper proposes a committee of dialogue policies. The results show that such a model is particularly beneficial for adaptation in multi-domain dialogue systems, and that it significantly improves performance over a single-policy baseline, as confirmed by a real-user trial. This is the first time a dialogue policy has been trained on multiple domains on-line in interaction with real users.
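To make the committee idea concrete, the sketch below shows the standard Bayesian committee machine rule for fusing Gaussian estimates, applied to a single scalar quantity such as one value estimate per domain policy. It is an illustration under stated assumptions rather than the paper's exact formulation: the function name `bcm_combine`, the scalar setting, and the shared zero-mean prior with variance `prior_variance` are all assumptions made here.

```python
from typing import Sequence, Tuple


def bcm_combine(
    means: Sequence[float],
    variances: Sequence[float],
    prior_variance: float,
) -> Tuple[float, float]:
    """Fuse Gaussian estimates from several committee members.

    Hypothetical helper: each member i reports a posterior mean and
    variance for the same quantity, and all members are assumed to
    share one zero-mean prior with variance `prior_variance`.
    """
    if not means or len(means) != len(variances):
        raise ValueError("need one variance per mean, and at least one member")
    if prior_variance <= 0 or any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    m = len(means)
    # Sum the member precisions, then subtract the prior precision that
    # was counted once per member beyond the first.
    precision = sum(1.0 / v for v in variances) - (m - 1) / prior_variance
    if precision <= 0:
        raise ValueError("combined precision is not positive")

    variance = 1.0 / precision
    # Precision-weighted average of the member means.
    mean = variance * sum(mu / v for mu, v in zip(means, variances))
    return mean, variance


if __name__ == "__main__":
    # Two members with equal confidence and a broad shared prior.
    print(bcm_combine([1.0, 3.0], [1.0, 1.0], prior_variance=4.0))
```

Members with lower variance dominate the combined estimate, which is why a committee of this kind can lean on the policy trained for a well-covered domain while still drawing on policies from other domains when data is scarce.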
