Paper information - Conceptual decoding from word lattices: application to the spoken dialogue corpus MEDIA

Within the framework of the French evaluation program MEDIA on spoken dialogue systems, this paper presents the methods proposed at the LIA for the robust extraction of basic conceptual constituents (or concepts) from an audio message. The proposed conceptual decoding model follows a stochastic paradigm and is directly integrated into the Automatic Speech Recognition (ASR) process. This approach keeps the probabilistic search space over the word sequences produced by the ASR module and projects it onto a probabilistic search space over concept sequences. The paper presents the first ASR results on the French spoken dialogue corpus MEDIA, available through ELDA. Experiments on this corpus show that the proposed approach outperforms the traditional sequential approach, which first looks for the best sequence of words and only then for the best sequence of concepts.

Index Terms: Automatic Speech Recognition, Spoken Dialogue, Spoken Language Understanding.
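The contrast the abstract draws between sequential and integrated decoding can be illustrated with a small sketch. This is not the authors' model: the toy hypotheses, their probabilities, the `CONCEPT_OF` lexicon, and the function names are all hypothetical, and the word lattice is flattened to a short weighted list of word sequences. Summing probability mass over word sequences that share a concept sequence is an assumed way of projecting one search space onto the other; the abstract does not say how the projection is computed.

```python
from collections import defaultdict

# Hypothetical toy "lattice", flattened to word sequences with probabilities.
HYPOTHESES = [
    (("room", "for", "too", "nights"), 0.40),
    (("room", "for", "two", "nights"), 0.35),
    (("rooms", "for", "two", "nights"), 0.25),
]

# Hypothetical word-to-concept lexicon; words not listed carry no concept.
CONCEPT_OF = {
    "room": "room",
    "rooms": "room",
    "two": "number",
    "nights": "time-unit",
}


def to_concepts(words):
    """Map a word sequence to its concept sequence, dropping concept-free words."""
    concepts = (CONCEPT_OF.get(w) for w in words)
    return tuple(c for c in concepts if c is not None)


def sequential_decode(hyps):
    """Pick the single best word sequence first, then read off its concepts."""
    best_words, _ = max(hyps, key=lambda h: h[1])
    return to_concepts(best_words)


def integrated_decode(hyps):
    """Project every word hypothesis to concepts and pool the probability mass."""
    mass = defaultdict(float)
    for words, prob in hyps:
        mass[to_concepts(words)] += prob
    best = max(mass, key=mass.get)
    return best, mass[best]


if __name__ == "__main__":
    print("sequential:", sequential_decode(HYPOTHESES))
    print("integrated:", integrated_decode(HYPOTHESES))
```

In this toy case the best single word sequence misses the number concept, while the pooled concept-level search recovers it because two lower-ranked word sequences agree on the same concepts.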

Frédéric Béchet | Pascal Nocera | Christophe Servan | Christian Raymond
