A probabilistic framework for dialog simulation and optimal strategy learning

The design of Spoken Dialog Systems cannot be considered as the simple combination of speech processing technologies. Indeed, speech-based interface design has been an expert job for a long time. It necessitates good skills in speech technologies and low-level programming. Moreover, rapid development and reusability of previously designed systems remains uneasy. This makes optimality and objective evaluation of design very difficult. The design process is therefore a cyclic process composed of prototype releases, user satisfaction surveys, bug reports and refinements. It is well known that human intervention for testing is time-consuming and above all very expensive. This is one of the reasons for the recent interest in dialog simulation for evaluation as well as for design automation and optimization. In this paper we expose a probabilistic framework for a realistic simulation of spoken dialogs in which the major components of a dialog system are modeled and parameterized thanks to independent data or expert knowledge. Especially, an Automatic Speech Recognition (ASR) system model and a User Model (UM) have been developed. The ASR model, based on articulatory similarities in language models, provides task-adaptive performance prediction and Confidence Level (CL) distribution estimation. The user model relies on the Bayesian Networks (BN) paradigm and is used both for user behavior modeling and Natural Language Understanding (NLU) modeling. The complete simulation framework has been used to train a reinforcement-learning agent on two different tasks. These experiments helped to point out several potentially problematic dialog scenarios.

[1]  Roberto Pieraccini,et al.  A stochastic model of computer-human interaction for learning dialogue strategies , 1997, EUROSPEECH.

[2]  Lin-Shan Lee,et al.  Computer-aided analysis and design for spoken dialogue systems based on quantitative simulations , 2001, IEEE Trans. Speech Audio Process..

[3]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[4]  Steve Renals,et al.  Confidence measures for hybrid HMM/ANN speech recognition , 1997, EUROSPEECH.

[5]  Olivier Pietquin,et al.  ASR system modeling for automatic evaluation and optimization of dialogue systems , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[6]  F. Jelinek,et al.  Perplexity—a measure of the difficulty of speech recognition tasks , 1977 .

[7]  Roberto Pieraccini,et al.  The use of belief networks for mixed-initiative dialog modeling , 2000, IEEE Trans. Speech Audio Process..

[8]  Ingrid Zukerman,et al.  # 2001 Kluwer Academic Publishers. Printed in the Netherlands. Predictive Statistical Models for User Modeling , 1999 .

[9]  R. Power The organisation of purposeful dialogues , 1979 .

[10]  Olivier Pietquin,et al.  A Framework for Unsupervised Learning of Dialogue Strategies , 2004 .

[11]  Roberto Pieraccini,et al.  A stochastic model of human-machine interaction for learning dialog strategies , 2000, IEEE Trans. Speech Audio Process..

[12]  Eric Horvitz,et al.  The Lumière Project: Bayesian User Modeling for Inferring the Goals and Needs of Software Users , 1998, UAI.

[13]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[14]  A. Tversky Features of Similarity , 1977 .

[15]  Thierry Dutoit,et al.  Aided design of finite-state dialogue management systems , 2003, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698).

[16]  Marilyn A. Walker,et al.  PARADISE: A Framework for Evaluating Spoken Dialogue Agents , 1997, ACL.

[17]  Roberto Pieraccini,et al.  Learning dialogue strategies within the Markov decision process framework , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.

[18]  Peder A. Olsen,et al.  Theory and practice of acoustic confusability , 2002, Comput. Speech Lang..

[19]  Marilyn A. Walker,et al.  Reinforcement Learning for Spoken Dialogue Systems , 1999, NIPS.

[20]  Vladimir I. Levenshtein,et al.  Binary codes capable of correcting deletions, insertions, and reversals , 1965 .

[21]  M. Eskenazi,et al.  The French language database: Defining, planning, and recording a large database , 1984, ICASSP.

[22]  Ramón López-Cózar,et al.  Assessment of dialogue systems by means of a new simulation technique , 2003, Speech Commun..

[23]  Steve Young,et al.  Simulation of human-machine dialogues , 1999 .

[24]  Marilyn A. Walker,et al.  Training a sentence planner for spoken dialogue using boosting , 2002, Comput. Speech Lang..