Recognizer performance in telephone-based spoken dialogue systems can be strongly affected by the transmission channel. To investigate the impact of individual parts of the transmission channel in more detail, a simulation model is presented. It implements all transmission characteristics of modern telephone networks, based on instrumentally measurable values of the kind used by network planners. The simulation runs in real time on programmable DSP-based hardware. It can be used to investigate recognizer performance systematically as a function of transmission channel degradations, to produce training material with specified transmission characteristics, or to estimate the impact of transmission impairments on dialogue flow and system usability. The impact of transmission channel characteristics on the performance of a speech recognizer integrated in an interactive voice server is then analyzed. The analysis shows that specific transmission characteristics may degrade recognition in ways that would not have been expected from the standard training material. An outlook is given on future extensions of the simulation model to better cover the effects of mobile and IP-based telephone systems.