Improvement of the speech recognition in noisy environments using a nonparametric regression

This paper presents an efficient speech recognition system based on the general regression neural network (GRNN). The GRNN has previously been applied to phoneme identification and isolated word recognition in quiet environments. We propose to extend this method to Arabic spoken word recognition in adverse conditions, because noise robustness is one of the most challenging problems in automatic speech recognition (ASR). The proposed system has been tested on Arabic digit recognition at different signal-to-noise ratio (SNR) levels and in various noisy conditions, including stationary and nonstationary background noises taken from the NOISEX-92 database. The proposed scheme is compared with similar recognisers based on the multilayer perceptron (MLP), the Elman recurrent neural network (RNN) and the discrete hidden Markov model (HMM). The experimental results show that a neural network approach that incorporates nonparametric regression improves the overall performance of the speech recogniser in noisy environments.
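As an illustration of the nonparametric regression behind a GRNN, the following minimal sketch implements the standard kernel-weighted estimate, in which each stored training vector contributes to the output in proportion to a Gaussian function of its distance from the input. It is not the system evaluated in the paper: the two-dimensional feature vectors, the two-class one-hot targets, the smoothing parameter `sigma` and the function names are all assumptions chosen only to keep the example small.

```python
import math

# Hypothetical toy training set: 2-D feature vectors with one-hot class targets.
# Real speech features (and the ten Arabic digit classes) are not reproduced here.
TRAIN_X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
TRAIN_Y = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]


def grnn_predict(x, train_x, train_y, sigma=0.5):
    """Return the kernel-weighted average of the training targets for input x."""
    # Pattern layer: one Gaussian kernel weight per stored training vector.
    weights = []
    for xi in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))

    # Summation layer: weighted sums of the targets, normalised by the total weight.
    total = sum(weights)
    n_out = len(train_y[0])
    if total == 0.0:
        # All kernels underflowed (input far from every training vector).
        return [0.0] * n_out
    return [
        sum(w * y[k] for w, y in zip(weights, train_y)) / total
        for k in range(n_out)
    ]


def grnn_classify(x, train_x, train_y, sigma=0.5):
    """Return the index of the class with the largest regression output."""
    scores = grnn_predict(x, train_x, train_y, sigma)
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    print(grnn_classify([0.05, 0.1], TRAIN_X, TRAIN_Y))
    print(grnn_classify([1.0, 0.9], TRAIN_X, TRAIN_Y))
```

There is no iterative training in this sketch: the training vectors are simply stored, and `sigma` is the only free parameter. A smaller `sigma` makes the estimate follow the nearest stored vectors more closely, while a larger value smooths across more of them.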
