Robust Speech Recognition System Using Conventional and Hybrid Features of MFCC, LPCC, PLP, RASTA-PLP and Hidden Markov Model Classifier in Noisy Conditions

In recent years, improving the accuracy of speech recognition (SR) has been one of the most active areas of research. Although SR systems work reasonably well in quiet conditions, they still suffer severe performance degradation in noisy conditions or over distorted channels. More robust feature extraction methods are therefore needed to achieve better performance in adverse conditions. This paper investigates the performance of conventional and new hybrid speech feature extraction algorithms based on Mel Frequency Cepstral Coefficients (MFCC), Linear Prediction Coding Coefficients (LPCC), Perceptual Linear Prediction (PLP), and RASTA-PLP in noisy conditions, using a multivariate Hidden Markov Model (HMM) classifier. The behavior of the proposed system is evaluated on the TIDIGITS speech corpus, recorded from 208 different adult speakers, for both training and testing. The theoretical basis of the speech processing and classification procedures is presented, and recognition results are reported in terms of word recognition rate.
