A model for efficient formant estimation

This paper presents a new method for estimating formant frequencies. The formant model is based on a digital resonator: each resonator represents one segment of the short-time power spectrum, and the complete spectrum is modeled by a set of digital resonators connected in parallel. An algorithm based on dynamic programming determines both the model parameters and the segment boundaries that optimally match the spectrum. The main results of this paper are: (1) modeling formants by digital resonators allows a reliable estimation of formant frequencies; (2) digital resonators can be used efficiently in connection with dynamic programming; and (3) a recognition test using the estimated formant frequencies yields a string error rate of 4.8% on the adult corpus of the TI digit string database.
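The segmentation idea can be illustrated with a minimal sketch. It is not the paper's algorithm. The abstract does not state the resonator form or the matching criterion, so the second-order all-pole resonator in `resonator_power` and the placeholder `variance_cost` (squared deviation from the segment mean) are assumptions, and all function names are hypothetical. The sketch only shows how dynamic programming can choose segment boundaries once a per-segment cost is available.

```python
import math

def resonator_power(omega, theta, r):
    """Power response of an assumed second-order all-pole resonator
    H(z) = 1 / (1 - 2 r cos(theta) z^-1 + r^2 z^-2), evaluated at z = e^{j*omega}."""
    a1 = -2.0 * r * math.cos(theta)
    a2 = r * r
    re = 1.0 + a1 * math.cos(omega) + a2 * math.cos(2.0 * omega)
    im = -(a1 * math.sin(omega) + a2 * math.sin(2.0 * omega))
    return 1.0 / (re * re + im * im)

def variance_cost(spectrum, i, j):
    """Placeholder segment cost (an assumption, not the paper's criterion):
    squared deviation of bins [i, j) from their mean."""
    seg = spectrum[i:j]
    mean = sum(seg) / len(seg)
    return sum((x - mean) ** 2 for x in seg)

def segment_spectrum(spectrum, num_segments, cost):
    """Split the bins into num_segments contiguous segments with minimal
    total cost. Returns (boundaries, total_cost); segment k covers the
    bins [boundaries[k], boundaries[k + 1])."""
    n = len(spectrum)
    inf = float("inf")
    # best[k][j]: minimal cost of covering bins [0, j) with k segments
    best = [[inf] * (n + 1) for _ in range(num_segments + 1)]
    back = [[0] * (n + 1) for _ in range(num_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, num_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1][i] + cost(spectrum, i, j)
                if c < best[k][j]:
                    best[k][j] = c
                    back[k][j] = i
    # Trace the stored split points back from the last bin.
    bounds = [n]
    j = n
    for k in range(num_segments, 0, -1):
        j = back[k][j]
        bounds.append(j)
    bounds.reverse()
    return bounds, best[num_segments][n]

# Toy "spectrum" with three flat regions.
toy = [1.0, 1.0, 1.0, 5.0, 5.0, 9.0, 9.0, 9.0]
boundaries, total = segment_spectrum(toy, 3, variance_cost)
```

In a formant setting, the placeholder cost would be replaced by the error of the best-fitting resonator within the segment, so that each segment yields one formant candidate. The recursion itself would stay the same.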
