"CrossTalk": technical challenge to VAD-like applications in mixed landline and mobile environments
A basic voice-activated dialing (VAD) or VAD-like application lets a user dial a telephone number simply by speaking the corresponding name. In the mobile environment, it offers additional value to users who are in a hands-busy and eyes-busy situation. However, if users are allowed to access such VAD-like services from both landline and mobile environments, the application presents unique technical challenges to the underlying automatic speech recognition (ASR) systems and to the service design, because each environment has ASR-performance-impacting variables that the other does not. To quantify the potential ASR performance degradation that results from crosstalking across environments, we conducted a series of performance assessment tasks using the CrossTalk database, collected from 44 speakers in the St. Louis area. This paper describes the tasks and discusses their experimental results.