A formant extraction method using autocorrelation domain inverse filtering and focusing method

An analysis-by-synthesis formant extraction method is proposed that uses autocorrelation-domain inverse filters (ADIFs) and multistep focusing. The inverse filter is a cascade of second-order ADIFs, each corresponding to one formant. Compared with conventional analysis-by-synthesis formant extraction based on frequency-domain or time-domain inverse filtering, the method significantly reduces computational cost. A human speech production model, specifically the glottal pole model, is implemented in the system. Formant extraction experiments on synthesized and natural speech demonstrate the robustness of the method. The real-time formant extraction system can be realized on a single-chip signal processor.
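The idea of inverse filtering in the autocorrelation domain can be illustrated with a minimal sketch. It relies on a standard identity: the autocorrelation of a filtered signal equals the input autocorrelation convolved with the autocorrelation of the filter coefficients. The function names, the pole-radius formula for a second-order section, and the example formant values below are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation (the multistep focusing search and the glottal pole model are not shown).

```python
import numpy as np

def second_order_inverse_filter(freq_hz, bw_hz, fs_hz):
    """Coefficients [1, a1, a2] of a second-order FIR inverse filter whose
    zeros cancel a resonance at freq_hz with bandwidth bw_hz (assumed form)."""
    r = np.exp(-np.pi * bw_hz / fs_hz)        # zero radius from bandwidth
    theta = 2.0 * np.pi * freq_hz / fs_hz     # zero angle from frequency
    return np.array([1.0, -2.0 * r * np.cos(theta), r * r])

def autocorr_full(x):
    """Two-sided deterministic autocorrelation, lags -(N-1)..(N-1)."""
    return np.correlate(x, x, mode="full")

def adif(r_full, a):
    """Apply one inverse filter directly to a two-sided autocorrelation.
    Uses r_y = r_x * r_a, where r_a is the autocorrelation of a."""
    r_a = np.correlate(a, a, mode="full")     # 5 taps for a second-order filter
    return np.convolve(r_full, r_a)

def cascade_adif(r_full, formants, fs_hz):
    """Cascade of second-order ADIFs, one per (frequency, bandwidth) pair."""
    for freq_hz, bw_hz in formants:
        r_full = adif(r_full, second_order_inverse_filter(freq_hz, bw_hz, fs_hz))
    return r_full

def residual_energy(r_full):
    """Zero-lag value (center of the two-sided sequence) is the residual energy."""
    return r_full[len(r_full) // 2]
```

In an analysis-by-synthesis loop, candidate formant frequencies and bandwidths would be adjusted to improve some measure computed from the residual autocorrelation, such as the zero-lag energy returned by `residual_energy`. Each section then costs a five-tap convolution over the retained lags rather than a pass over every sample in the frame, which is one plausible source of the savings the abstract reports.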
