Task independent and dependent training: performance comparison of HMM and hybrid HMM/MLP approaches

This paper compares speaker-independent isolated word recognition performance obtained with standard phonemic hidden Markov models (HMMs) and with hybrid approaches that use a multilayer perceptron (MLP) to estimate the HMM emission probabilities. The hybrid approach has previously been shown to be particularly effective on a large-vocabulary, speaker-independent, continuous speech recognition task (ARPA Resource Management) using simple context-independent phoneme models and single-pronunciation word models. The main goal of the paper is therefore to compare the performance that the different approaches can achieve under both task-dependent and task-independent training.
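The abstract does not spell out how MLP outputs stand in for HMM emission probabilities. In the hybrid approach as it is commonly described, the MLP is trained to output per-frame phoneme posteriors P(q|x), and these are divided by the phoneme priors P(q) to give scaled likelihoods that the HMM decoder can use in place of p(x|q). The sketch below illustrates only that conversion, in the log domain. The function name, the `floor` parameter, and the toy numbers are assumptions made for illustration and are not taken from the paper.

```python
import math

def scaled_log_likelihoods(posteriors, priors, floor=1e-10):
    """Convert per-frame posteriors P(q|x) into scaled log-likelihoods.

    posteriors: list of frames, each a list of P(q|x) over phoneme classes
                (hypothetical MLP outputs).
    priors:     list of class priors P(q), e.g. relative class frequencies
                in the training alignment (an assumption here).
    floor:      small constant that avoids log(0); illustrative choice.
    """
    log_priors = [math.log(max(p, floor)) for p in priors]
    return [
        [math.log(max(p, floor)) - lp for p, lp in zip(frame, log_priors)]
        for frame in posteriors
    ]

# Toy example with three phoneme classes and two frames (made-up values).
priors = [0.5, 0.25, 0.25]
posteriors = [
    [0.5, 0.25, 0.25],  # posteriors equal to the priors: no evidence either way
    [0.1, 0.8, 0.1],    # strong evidence for the second class
]
scores = scaled_log_likelihoods(posteriors, priors)
```

When a frame's posteriors equal the priors, every scaled log-likelihood is zero, so that frame does not favor any phoneme during decoding. Dividing by the priors also stops frequent phonemes from being favored merely because they dominate the training data.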
