A Comparative Study of Feature Extraction Methods Applied to Continuous Speech Recognition in Romanian Language

This paper describes continuous speech recognition experiments on a Romanian-language speech database using hidden Markov models (HMM). We compare the recognition rates obtained in our ASR system with front-ends based on features extracted by perceptual variants of cepstral analysis and linear prediction, and by simple linear prediction. The best results, obtained with 36 mel-frequency cepstral coefficients (MFCC), are used as the baseline for ranking the front-ends based on linear prediction. The second rank, obtained with 5 perceptual linear prediction (PLP) coefficients, is very promising and clearly better than the last-ranked performance of the simple linear prediction coefficients (LPC). We reorganized the database into three sets: one for male speakers, one for female speakers, and one for both male and female speakers.
