A family of formant trackers based on hidden Markov models

This paper describes a family of formant trackers based on hidden Markov models and vector quantization of LPC spectra. Two general classes of models are presented, differing in whether formants are tracked singly or jointly. The states of a single-formant model are scalar values corresponding to possible formant frequencies. The states of a multi-formant model are frequency vectors defining possible formant configurations. Formant detection and estimation are performed simultaneously using the forward-backward algorithm. Model parameters are estimated from hand-marked formant tracks. The models have been evaluated using portions of the Texas Instruments multi-dialect connected digits database. The most accurate configurations exhibited root mean square estimation errors of about 70 Hz, 95 Hz, and 140 Hz for F1, F2, and F3, respectively.
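To make the single-formant case concrete, the following is a minimal sketch of forward-backward smoothing over a small set of candidate formant frequencies, with vector-quantized spectra represented as codebook indices. It is an illustration under stated assumptions, not the paper's implementation: the function names, the three-state frequency grid, the three-symbol codebook, and all probability values are hypothetical, and reading out the estimate as a posterior mean or a per-frame maximum is likewise an assumption. In the paper, the model parameters are estimated from hand-marked formant tracks rather than chosen by hand.

```python
from typing import List, Sequence


def forward_backward(
    init: Sequence[float],
    trans: Sequence[Sequence[float]],
    emit: Sequence[Sequence[float]],
    obs: Sequence[int],
) -> List[List[float]]:
    """Return gamma[t][s] = P(state s at frame t | all observations).

    init[s]     : prior probability of state s at the first frame
    trans[r][s] : probability of moving from state r to state s
    emit[s][k]  : probability that state s emits codebook index k
    obs[t]      : codebook index (VQ label) observed at frame t
    """
    n_states = len(init)
    n_frames = len(obs)

    # Forward pass, rescaled at every frame to avoid numerical underflow.
    alpha: List[List[float]] = []
    scale: List[float] = []
    for t in range(n_frames):
        if t == 0:
            row = [init[s] * emit[s][obs[0]] for s in range(n_states)]
        else:
            prev = alpha[-1]
            row = [
                emit[s][obs[t]] * sum(prev[r] * trans[r][s] for r in range(n_states))
                for s in range(n_states)
            ]
        c = sum(row)
        if c == 0.0:
            raise ValueError("observation sequence has zero probability")
        alpha.append([x / c for x in row])
        scale.append(c)

    # Backward pass, using the same scale factors as the forward pass.
    beta = [[1.0] * n_states for _ in range(n_frames)]
    for t in range(n_frames - 2, -1, -1):
        for s in range(n_states):
            beta[t][s] = sum(
                trans[s][r] * emit[r][obs[t + 1]] * beta[t + 1][r]
                for r in range(n_states)
            ) / scale[t + 1]

    # Combine into per-frame state posteriors and renormalize for safety.
    gamma: List[List[float]] = []
    for t in range(n_frames):
        row = [alpha[t][s] * beta[t][s] for s in range(n_states)]
        z = sum(row)
        gamma.append([x / z for x in row])
    return gamma


def posterior_mean_track(gamma: Sequence[Sequence[float]], freqs: Sequence[float]) -> List[float]:
    """Per-frame expected frequency under the state posteriors."""
    return [sum(p * f for p, f in zip(row, freqs)) for row in gamma]


def map_track(gamma: Sequence[Sequence[float]], freqs: Sequence[float]) -> List[float]:
    """Per-frame most probable frequency under the state posteriors."""
    return [freqs[max(range(len(row)), key=row.__getitem__)] for row in gamma]


# Hypothetical toy model: three candidate F1 values and a three-entry codebook.
FREQS_HZ = [300.0, 500.0, 700.0]
INIT = [1 / 3, 1 / 3, 1 / 3]
TRANS = [  # favors staying near the current frequency
    [0.80, 0.15, 0.05],
    [0.15, 0.70, 0.15],
    [0.05, 0.15, 0.80],
]
EMIT = [  # each state mostly emits "its own" codebook entry
    [0.80, 0.15, 0.05],
    [0.15, 0.70, 0.15],
    [0.05, 0.15, 0.80],
]
OBS = [0, 0, 1, 2, 2]  # VQ labels for five consecutive frames

if __name__ == "__main__":
    g = forward_backward(INIT, TRANS, EMIT, OBS)
    print(posterior_mean_track(g, FREQS_HZ))
    print(map_track(g, FREQS_HZ))
```

A multi-formant model would have the same structure, with each state standing for a frequency vector (one entry per formant) instead of a single scalar. Handling detection as well as estimation would need some representation of frames in which no formant is present, which this sketch omits.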
