The second-order derivatives of MFCC for improving spoken Arabic digits recognition using Tree distributions approximation model and HMMs

Mel Frequency Cepstral Coefficients (MFCCs) are the most widely used speech features in speech and speaker recognition applications. In this paper, we study the effect of the second-order derivatives of MFCCs on the recognition of spoken Arabic digits. The system was developed using Hidden Markov Models (HMMs) and a tree distribution approximation model. Experiments show that, compared with MFCCs alone, the second-order derivatives of the MFCC parameters improve the recognition rate by 4.60% for the CHMM. We reached an overall recognition accuracy of 98.41%, which is satisfactory compared to previous work on spoken Arabic digit recognition.
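
As an illustration of what second-order derivatives of MFCCs are, the sketch below appends first-order (delta) and second-order (delta-delta) coefficients to a sequence of MFCC frames. It uses the common regression formula d_t = sum_{n=1..N} n (c_{t+n} - c_{t-n}) / (2 sum_{n=1..N} n^2) with edge frames repeated at the boundaries. The window width N = 2, the edge handling, and the names `delta` and `add_derivatives` are assumptions made for this example; the abstract does not state how the derivatives were computed.

```python
from typing import List

Frame = List[float]  # one frame of cepstral coefficients


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-based time derivative of a feature sequence.

    Assumption: boundary frames are repeated when t - n or t + n
    falls outside the utterance.
    """
    if not frames:
        return []
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]   # clamp at the last frame
                behind = frames[max(t - n, 0)][k]     # clamp at the first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_derivatives(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Return frames extended as [static, delta, delta-delta]."""
    d1 = delta(frames, width)   # first-order derivatives
    d2 = delta(d1, width)       # second-order derivatives
    return [c + a + b for c, a, b in zip(frames, d1, d2)]
```

With 13 static MFCCs per frame, for example, `add_derivatives` would produce 39-dimensional feature vectors, which could then be passed to the classifier in place of the static coefficients.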
