Using posterior word probabilities for improved speech recognition

In this paper we present a new scoring scheme for speech recognition. Instead of using the joint probability of a word sequence and a sequence of acoustic observations, we determine the best path through a word graph using posterior word probabilities. These probabilities are computed beforehand with a modified forward-backward algorithm. Note that no language model is needed during the search for the best path, because it is already accounted for in the posterior word probabilities. Subsequent modules can therefore process these word graphs very efficiently. In addition, confidence measures can be computed from the posterior word probabilities at no additional cost. We present experimental results on five corpora: the Dutch Arise corpus, the German Verbmobil '98 corpus, the English North American Business '94 20k and 64k development corpora, and the English Broadcast News '96 corpus. The relative reduction in word error rate ranges from 1.5% to 5.0%.
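A minimal sketch of the underlying computation (the notation below is illustrative and not taken verbatim from the paper): let a word-graph edge [w; τ, t] hypothesize word w over the acoustic observations x_τ, ..., x_t out of the full sequence x_1^T. A forward-backward pass over the graph, with acoustic and language-model scores attached to the edges, yields the posterior of each edge:

\[
p([w;\tau,t] \mid x_1^T) \;=\; \frac{\Phi(w;\tau,t)\,\Psi(w;\tau,t)}{p(x_1^T)} ,
\]

where \(\Phi(w;\tau,t)\) sums the joint scores of all partial paths from the graph start up to and including the edge, \(\Psi(w;\tau,t)\) sums the scores of all partial paths from the edge end to the graph end, and \(p(x_1^T)\) is the sum over all complete paths. Under these assumptions, the best sentence hypothesis is found by a simple best-path search over the graph,

\[
\hat{W} \;=\; \operatorname*{argmax}_{W=w_1 \ldots w_N}\;\prod_{n=1}^{N} p([w_n;\tau_n,t_n] \mid x_1^T) ,
\]

which requires no further language-model lookups, since the language model is already folded into each posterior; each factor also serves directly as a per-word confidence score.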
