Improved Large Vocabulary Continuous Chinese Speech Recognition by Character-Based Consensus Networks

Word-based consensus networks have proven very useful for minimizing word error rate (WER) in large vocabulary continuous speech recognition for Western languages. Considering the special structure of the Chinese language, this paper argues that character-based rather than word-based consensus networks should work better for Chinese. This claim is verified by the extensive experimental results reported in the paper.
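To make the intuition concrete, the following is a minimal sketch of character-level voting over an N-best list. It is not the consensus network construction used in the paper, which clusters lattice arcs. It only illustrates, under simplifying assumptions, why voting on characters can select a different output than picking the single most probable sentence. The function name `character_consensus`, the toy hypotheses, and their posterior values are all hypothetical, and the sketch assumes every hypothesis contains the same number of characters so that positions align trivially.

```python
from collections import defaultdict


def character_consensus(nbest):
    """Pick the highest-posterior character at each position.

    nbest: list of (words, posterior) pairs, where words is a list of
    Chinese word strings. Assumption for this sketch: every hypothesis
    has the same total number of characters.
    """
    # Drop word boundaries so hypotheses with different segmentations
    # can be compared character by character.
    flattened = [("".join(words), posterior) for words, posterior in nbest]
    length = len(flattened[0][0])
    if any(len(chars) != length for chars, _ in flattened):
        raise ValueError("this sketch needs hypotheses of equal character length")

    output = []
    for position in range(length):
        votes = defaultdict(float)
        for chars, posterior in flattened:
            # Each hypothesis votes for its character with its posterior mass.
            votes[chars[position]] += posterior
        output.append(max(votes, key=votes.get))
    return "".join(output)


# Toy N-best list (made-up posteriors that sum to 1).
nbest = [
    (["北京", "大学"], 0.40),   # the single most probable sentence
    (["北京大", "雪"], 0.35),   # different segmentation, same first three characters
    (["背景", "大雪"], 0.25),
]

print(character_consensus(nbest))
```

In this toy case the three hypotheses segment the utterance into different words, so word-level votes are split across non-matching units, while the character-level votes for the final position pool the mass of the second and third hypotheses and outvote the top sentence there.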
