Explicit word error minimization using word hypothesis posterior probabilities

We introduce a new concept, the time frame error rate. We show that this error rate is closely correlated with the word error rate and use it to overcome the mismatch between Bayes' decision rule, which aims at minimizing the expected sentence error rate, and the word error rate, which is used to assess the performance of speech recognition systems. Based on time frame errors, we derive a new decision rule and show that it consistently reduces the word error rate on various recognition tasks. All stochastic models are left completely unchanged. We present experimental results on five corpora: the Dutch Arise corpus, the German Verbmobil '98 corpus, the English North American Business '94 20k and 64k development corpora, and the English Broadcast News '96 corpus. The relative reduction of the word error rate ranges from 2.3% to 5.1%.
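The abstract does not spell out the decision rule, so the following is only a minimal sketch of one way a time-frame-based rule could look. It assumes an N-best list in which each hypothesis carries a sentence posterior and word segments with frame boundaries. The function names, the tuple layout, and the use of an N-best list instead of a word graph are all illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

# Assumed layout: a hypothesis is (sentence_posterior, segments), where
# segments is a list of (word, start_frame, end_frame), end exclusive.

def frame_posteriors(hyps):
    """Accumulate, for every time frame, the posterior mass of each word."""
    post = defaultdict(lambda: defaultdict(float))
    for p, segs in hyps:
        for word, start, end in segs:
            for t in range(start, end):
                post[t][word] += p
    return post

def expected_frame_errors(segs, post):
    """Expected number of frames whose word label would be wrong."""
    return sum(1.0 - post[t][word]
               for word, start, end in segs
               for t in range(start, end))

def min_frame_error_hypothesis(hyps):
    """Pick the hypothesis with the lowest expected time frame error."""
    post = frame_posteriors(hyps)
    return min(hyps, key=lambda h: expected_frame_errors(h[1], post))

# Toy 10-frame example (invented numbers, for illustration only).
hyps = [
    (0.40, [("a", 0, 5), ("b", 5, 10)]),
    (0.35, [("a", 0, 5), ("c", 5, 10)]),
    (0.25, [("d", 0, 5), ("c", 5, 10)]),
]
map_choice = max(hyps, key=lambda h: h[0])     # sentence-level (Bayes/MAP) choice
frame_choice = min_frame_error_hypothesis(hyps)
```

In this toy case the sentence-level rule picks the first hypothesis, while the frame-level rule picks the second, because "a" and "c" each collect more posterior mass per frame than their competitors. The acoustic and language model scores are not touched; only the selection criterion changes, which matches the abstract's statement that all stochastic models are left unchanged.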
