Discriminative state tracking for spoken dialog systems

In spoken dialog systems, statistical state tracking aims to improve robustness to speech recognition errors by tracking a posterior distribution over hidden dialog states. Current approaches, whether based on generative or discriminative models, have distinct but important shortcomings that limit their accuracy. In this paper, we discuss these limitations and introduce a new approach to discriminative state tracking that overcomes them by leveraging the structure of the problem. An offline evaluation on dialog data collected from real users shows improvements in both state tracking accuracy and the quality of the posterior probabilities. Features that encode speech recognition error patterns are particularly helpful, and training requires relatively few dialogs.
