Evaluation on the Aurora 2 database of acoustic models that are less noise-sensitive

The Aurora 2 database may be used as a benchmark for evaluating algorithms under noisy conditions. In particular, the clean training/noisy test mode is aimed at evaluating models that are trained on clean data only, without further adjustment on the noisy data, i.e. under severe mismatch between the training and test conditions. While several researchers have proposed techniques at the front-end level to improve recognition performance over the reference hidden Markov model (HMM) baseline, investigations at the back-end level are also sought. In this respect, the goal is to develop acoustic models that are intrinsically less noise-sensitive. This paper presents the word accuracy yielded by a non-parametric HMM with connectionist estimates of the emission probabilities, i.e. a neural network is applied instead of the usual parametric (Gaussian mixture) probability densities. A regularization technique, relying on a maximum-likelihood parameter grouping algorithm, is explicitly introduced to increase the generalization capability of the model and, in turn, its noise robustness. Results show that a relative word error rate reduction with respect to the Gaussian-mixture HMM is obtained when averaging over the different noises and SNRs of Aurora 2 test set A.
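As an illustration of the metric named above, the following minimal sketch computes a relative word error rate (WER) reduction from word accuracies. It assumes that accuracies are first averaged over all noise/SNR conditions and that WER is taken as 100 minus word accuracy; the abstract does not state the exact averaging procedure, so this is one plausible reading. The function name and the accuracy values are hypothetical and are not results from the paper.

```python
def relative_wer_reduction(baseline_acc, model_acc):
    """Relative WER reduction (in %) of a model over a baseline.

    Both arguments are sequences of word accuracies (in %), one entry per
    noise/SNR condition. Assumption: accuracies are averaged first, and
    WER = 100 - word accuracy.
    """
    if len(baseline_acc) == 0 or len(baseline_acc) != len(model_acc):
        raise ValueError("need the same non-zero number of conditions")
    baseline_wer = 100.0 - sum(baseline_acc) / len(baseline_acc)
    model_wer = 100.0 - sum(model_acc) / len(model_acc)
    if baseline_wer == 0.0:
        raise ValueError("baseline WER is zero; relative reduction undefined")
    return 100.0 * (baseline_wer - model_wer) / baseline_wer


# Hypothetical accuracies for two conditions (illustrative values only).
gmm_hmm_acc = [80.0, 60.0]   # baseline Gaussian-mixture HMM
hybrid_acc = [85.0, 65.0]    # connectionist/HMM hybrid
print(f"{relative_wer_reduction(gmm_hmm_acc, hybrid_acc):.2f}")  # prints 16.67
```

Averaging the per-condition relative reductions instead would generally give a different figure, so the averaging order should be reported together with the number.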
