Adaptive categorical understanding for spoken dialogue systems

In this paper, the speech understanding problem in the context of a spoken dialogue system is formalized in a maximum likelihood framework. Off-line adaptation of stochastic language models is proposed, in which dialogue-state-specific language models are interpolated with a general application-level language model. Word n-grams and dialogue-state n-grams are used to build the categorical understanding models and the dialogue models, respectively. Acoustic confidence scores are incorporated into the understanding formulation. Problems due to data sparseness and out-of-vocabulary words are discussed. The performance of the speech recognition and understanding language models is evaluated on the "Carmen Sandiego" multimodal computer game corpus. Incorporating dialogue models reduces the understanding error rate by a relative 15%-25%, while acoustic confidence scores achieve a further 10% error reduction for this computer gaming application.
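The interpolation of a dialogue-state-specific language model with a general application-level one can be illustrated with a minimal sketch. The abstract does not give the exact interpolation form, the n-gram order used for adaptation, or the weights, so everything below is an assumption made for illustration: linear interpolation, add-one smoothed unigrams (chosen for brevity), a hypothetical weight `lam`, a hypothetical dialogue state named "confirm", and toy sentences that are not taken from the corpus.

```python
from collections import Counter

def train_unigram(sentences, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def interpolate(state_lm, general_lm, lam):
    """Linear interpolation: P(w) = lam * P_state(w) + (1 - lam) * P_general(w)."""
    return {w: lam * state_lm[w] + (1 - lam) * general_lm[w] for w in general_lm}

# Toy data (hypothetical): all application utterances, plus the subset
# observed in one dialogue state.
general_data = ["go to paris", "talk to the witness", "go to cairo", "yes", "no"]
confirm_data = ["yes", "no", "yes"]

vocab = sorted({w for s in general_data for w in s.split()})
general_lm = train_unigram(general_data, vocab)
state_lm = train_unigram(confirm_data, vocab)

# lam = 0.7 is an arbitrary illustrative weight, not a value from the paper.
mixed = interpolate(state_lm, general_lm, lam=0.7)
```

In this sketch the interpolated model stays a proper distribution over the vocabulary, and words that are frequent in the chosen state (here "yes") receive more probability mass than under the general model alone, while words unseen in that state keep nonzero probability through the general model. In practice the weight would typically be tuned on held-out data for each state.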
