Paper information - Automatic formant extraction utilizing mel scale and equal loudness contour

Automatic formant extraction utilizing mel scale and equal loudness contour

This paper describes a method for calculating poles, termed 'auditory formants', under the assumption that the number of pole pairs of auditory significance (or of higher resonance Q factor) is three. The speech wave is first analyzed by the linear prediction (LP) method with 14 (or 20) predictor coefficients. The frequency axis of the resulting power spectrum is then warped onto the mel scale, and the amplitude is weighted according to the equal loudness contour. The processed power spectrum is converted into an autocorrelation function by the inverse Fourier transform. LP analysis is then performed again, using the first six terms of this autocorrelation, to obtain three conjugate pole pairs. The frequencies of these poles give the mel formant frequencies. Several experimental results are shown.
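The processing chain in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the mel formula (2595 log10(1 + f/700)), the equal-loudness weighting (the rational approximation commonly used in PLP analysis), the Hamming window, the grid size, and all function names are choices made here, since the abstract does not specify them. "The first six terms" is read here as lags 1 to 6 in addition to lag 0, which is what a sixth-order LP fit (three conjugate pole pairs) requires.

```python
import numpy as np

def hz_to_mel(f):
    # Assumed mel formula; the abstract does not say which one was used.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def equal_loudness(f):
    # Assumed stand-in for the "equal loudness contour" weighting:
    # the rational approximation commonly used in PLP analysis.
    w2 = (2.0 * np.pi * np.asarray(f, dtype=float)) ** 2
    return (w2 + 56.8e6) * w2 ** 2 / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))

def levinson(r, order):
    """Levinson-Durbin recursion. Returns (a, err) with a[0] == 1."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def auditory_formants(x, fs, lp_order=14, n_grid=512):
    """Return (mel_formants, hz_formants) for one frame of speech."""
    x = np.asarray(x, dtype=float) * np.hamming(len(x))
    n = len(x)
    # Step 1: first LP analysis (autocorrelation method).
    r = np.correlate(x, x, mode="full")[n - 1:n + lp_order]
    r[0] *= 1.0 + 1e-9                      # tiny regularisation (assumption)
    a, err = levinson(r, lp_order)
    # Step 2: LP power spectrum sampled on a grid uniform in mel.
    mel_max = hz_to_mel(fs / 2.0)
    hz = mel_to_hz(np.linspace(0.0, mel_max, n_grid))
    omega = 2.0 * np.pi * hz / fs
    A = np.exp(-1j * np.outer(omega, np.arange(lp_order + 1))) @ a
    power = err / np.abs(A) ** 2
    # Step 3: equal-loudness amplitude weighting.
    power = power * equal_loudness(hz)
    # Step 4: inverse Fourier transform -> autocorrelation on the mel axis.
    r_mel = np.fft.irfft(power)[:7]         # lag 0 plus lags 1..6
    # Step 5: second LP analysis, order 6 -> up to three conjugate pole pairs.
    a2, _ = levinson(r_mel, 6)
    roots = np.roots(a2)
    roots = roots[np.imag(roots) > 0]       # one pole from each conjugate pair
    theta = np.sort(np.angle(roots))        # pole angles in (0, pi)
    mel_formants = theta / np.pi * mel_max  # grid spans 0..mel_max over 0..pi
    return mel_formants, mel_to_hz(mel_formants)
```

Called on a single frame, for example `auditory_formants(frame, fs=10000)`, the function returns the pole frequencies on the mel axis and their equivalents in Hz. Fewer than three values are returned when some roots of the sixth-order polynomial turn out to be real rather than complex.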

Shuichi Itahashi | Shoichi Yokoyama
