On the Use of the Formant Features in the Dynamic Time Warping Based Recognition of Isolated Words

The feasibility of using formant features (FF) in user-dependent isolated word recognition was investigated. Word recognition was performed using dynamic time warping. Several formant feature extraction methods were compared, and a method based on singular prediction polynomials is proposed for the recognition of isolated words. The recognition performance of the proposed method was compared with that of linear prediction coding (LPC) features and LPC-derived cepstral (LPCC) features. In total, 111 Lithuanian words were used in the recognition experiment, and performance was evaluated at various noise levels. The experiments showed that formant features calculated from the singular prediction polynomials are more reliable than the LPC and LPCC features at all noise levels.
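To make the recognition step concrete, the following is a minimal sketch of template matching with dynamic time warping over sequences of feature vectors. It is not the authors' implementation: the symmetric step pattern, the Euclidean frame distance, the length normalization, and the names `dtw_distance` and `recognize` are all assumptions chosen for illustration, and the formant values in the usage example are made up.

```python
import numpy as np


def dtw_distance(ref, test):
    """DTW distance between two feature sequences of shape (frames, features).

    Assumed setup (not taken from the paper): Euclidean local cost,
    steps (i-1, j), (i, j-1), (i-1, j-1), normalization by n + m.
    """
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    n, m = len(ref), len(test)

    # Local cost: Euclidean distance between every pair of frames.
    cost = np.linalg.norm(ref[:, None, :] - test[None, :, :], axis=2)

    # Accumulated cost with an extra border row/column set to infinity.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev

    # Normalize so that long and short words are comparable.
    return acc[n, m] / (n + m)


def recognize(templates, test):
    """Return the label of the stored template closest to the test utterance."""
    return min(templates, key=lambda word: dtw_distance(templates[word], test))


# Hypothetical two-formant tracks (Hz) for two template words.
templates = {
    "a": [[500, 1500], [600, 1600], [700, 1700]],
    "b": [[300, 2300], [320, 2200], [350, 2100]],
}
# A time-stretched repetition of word "a".
utterance = [[500, 1500], [500, 1500], [600, 1600], [700, 1700], [700, 1700]]
print(recognize(templates, utterance))
```

In a user-dependent setup such as the one described, each vocabulary word would have one or more templates recorded by the same speaker, and the feature vectors could be formant, LPC, or LPCC frames; only the feature extraction changes, not the matching step.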
