Paper information - Performance of speech recognition devices: evaluating speech produced over the telephone network

A performance evaluation facility has been designed for rapid evaluation of representative speech databases on multiple ASR (automatic speech recognition) devices. The results of an evaluation session are collected and processed automatically. The performance of four ASR devices is reported: three isolated-word recognizers and one continuous-speech recognizer. Recognition accuracy per string has been tabulated, and patterns of insertion, deletion, and substitution errors are presented. At the word level, the continuous-digit recognizer performed better on continuous-digit strings than the isolated-word recognizers did on isolated-digit strings. ASR performance on the VOIS (voice-operated intercept services) database was considerably worse than performance on the TI digits. On the VOIS database, the best performance achieved by an isolated-word device was 75.9% word accuracy. The continuous-speech ASR achieved 80.7% word accuracy and 35.2% string accuracy. The lower accuracy rates on VOIS reflect the realistic conditions under which the digits were obtained. This argues for assessing ASR devices under actual application conditions.
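The abstract reports word accuracy, string accuracy, and counts of insertion, deletion, and substitution errors without giving the scoring formulas. The sketch below shows one common way such figures can be computed. It is an illustration under stated assumptions, not the paper's scoring procedure: the function names `align_counts` and `score` are hypothetical, errors are counted from a minimum-edit-distance alignment, word accuracy is taken as (N - S - D - I) / N over all reference words, and a string counts as correct only if it contains no errors. The sample digit strings are made up for the example.

```python
def align_counts(ref, hyp):
    """Count (substitutions, deletions, insertions) in a minimum-edit alignment.

    ref and hyp are lists of word tokens (e.g. digits as strings).
    """
    n, m = len(ref), len(hyp)
    # cell[i][j] = (total_errors, subs, dels, ins) for ref[:i] versus hyp[:j]
    cell = [[None] * (m + 1) for _ in range(n + 1)]
    cell[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cell[i][0] = (i, 0, i, 0)      # every reference word deleted
    for j in range(1, m + 1):
        cell[0][j] = (j, 0, 0, j)      # every hypothesis word inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = cell[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, s, d, k)              # match, no new error
            else:
                diagonal = (t + 1, s + 1, d, k)      # substitution
            t, s, d, k = cell[i - 1][j]
            deletion = (t + 1, s, d + 1, k)
            t, s, d, k = cell[i][j - 1]
            insertion = (t + 1, s, d, k + 1)
            # Tuples compare on total errors first, so this picks a
            # minimum-error alignment with a deterministic tie-break.
            cell[i][j] = min(diagonal, deletion, insertion)
    _, subs, dels, ins = cell[n][m]
    return subs, dels, ins


def score(pairs):
    """Aggregate error counts and accuracies over (reference, hypothesis) pairs."""
    total_words = subs = dels = ins = correct_strings = 0
    for ref, hyp in pairs:
        s, d, k = align_counts(ref, hyp)
        total_words += len(ref)
        subs, dels, ins = subs + s, dels + d, ins + k
        if s + d + k == 0:
            correct_strings += 1
    errors = subs + dels + ins
    return {
        "substitutions": subs,
        "deletions": dels,
        "insertions": ins,
        # Assumed definition: insertions are penalized, so this can go negative.
        "word_accuracy": (total_words - errors) / total_words,
        "string_accuracy": correct_strings / len(pairs),
    }


# Made-up digit strings, for illustration only.
sample = [
    ("1 2 3".split(), "1 2 3".split()),   # fully correct string
    ("4 5 6".split(), "4 6".split()),     # one deletion
    ("7 8".split(), "7 9 8".split()),     # one insertion
    ("0 1".split(), "0 2".split()),       # one substitution
]
print(score(sample))
```

Because a single word error makes the whole string wrong, string accuracy falls much faster than word accuracy as strings get longer, which is consistent with the gap between the 80.7% word accuracy and 35.2% string accuracy quoted above.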

[1] David S. Pallett. Performance assessment of automatic speech recognizers, 1985.

[2] R. G. Leonard, et al. A database for speaker-independent digit recognition, 1984, ICASSP.