Joint decoding for phoneme-grapheme continuous speech recognition

Standard automatic speech recognition (ASR) systems typically use phonemes as subword units. Preliminary studies have shown that ASR performance could be improved by using graphemes as additional subword units. We investigate a system in which word models are defined in terms of two different subword units, phonemes and graphemes. Models for both subword units are trained; during recognition, either both units or just one of them is used. We have studied this system on a continuous speech recognition task in American English. Our studies show that using grapheme information together with phoneme information improves ASR performance.
