Log-linear dialog manager

We design a log-linear probabilistic model for solving the dialog management task. In both planning and learning we optimize the same objective function: the expected reward. Rather than performing full policy optimization, we perform on-line estimation of the optimal action as a belief-propagation inference step. We employ context-free grammars to describe our variable spaces, which enables us to define rich features. To scale our approach to large variable spaces, we use particle belief propagation. Experiments show that the model is able to choose system actions that yield a high expected reward, outperforming its POMDP-like log-linear counterpart and a hand-crafted rule-based system.
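The core idea, scoring hidden dialog states with a log-linear model and then picking the system action with the highest expected reward, can be illustrated with a minimal sketch. This is not the paper's implementation. It enumerates a tiny hand-made state set exhaustively, whereas the paper uses grammar-defined variable spaces and particle belief propagation. The state names, action names, features, weights, and reward values below are all hypothetical.

```python
import math

def log_linear_probs(candidates, features, weights):
    """Return p(x) proportional to exp(w . f(x)) over a finite candidate list."""
    scores = [sum(w * f for w, f in zip(weights, features(x))) for x in candidates]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def best_action(actions, states, features, weights, reward):
    """Pick the action that maximizes expected reward under the state distribution."""
    probs = log_linear_probs(states, features, weights)

    def expected_reward(action):
        return sum(p * reward(s, action) for p, s in zip(probs, states))

    return max(actions, key=expected_reward)

# Hypothetical toy problem: two user intentions, three system actions.
STATES = ["wants_weather", "wants_music"]
ACTIONS = ["show_weather", "play_music", "ask_confirm"]

def toy_features(state):
    # One indicator feature per state (an assumption made for illustration).
    return [1.0 if state == s else 0.0 for s in STATES]

def toy_reward(state, action):
    # Assumed reward table: a small payoff for confirming, +1 / -1 for acting.
    if action == "ask_confirm":
        return 0.2
    matches = (state, action) in {("wants_weather", "show_weather"),
                                  ("wants_music", "play_music")}
    return 1.0 if matches else -1.0

if __name__ == "__main__":
    # Confident belief in "wants_weather" versus a uniform belief.
    print(best_action(ACTIONS, STATES, toy_features, [2.0, 0.0], toy_reward))
    print(best_action(ACTIONS, STATES, toy_features, [0.0, 0.0], toy_reward))
```

In this toy setting, a confident belief favors acting directly, while a uniform belief makes the confirmation action the better choice. Trading off acting against clarifying in this way is what an expected-reward objective is meant to capture.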
