Automatic formant extraction utilizing mel scale and equal loudness contour

A method is described for calculating poles, termed "auditory formants", on the assumption that the number of pole pairs of auditory significance (or of higher resonance Q factor) is three. The speech wave is first analyzed by the linear prediction (LP) method with 14 (or 20) predictor coefficients. The frequency axis of the resulting power spectrum is transformed to the mel scale, and the amplitude is weighted according to an equal-loudness contour. The processed power spectrum is then converted into an autocorrelation function by inverse Fourier transform. LP analysis is performed a second time, using the first six terms of the autocorrelation, to obtain three complex-conjugate pole pairs. The frequencies of these poles give the mel formant frequencies. Several experimental results are shown.
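
The processing chain described above can be illustrated with a minimal sketch in Python with NumPy. It is not the authors' implementation. Several details are assumptions that the abstract does not specify: the mel formula (2595 log10(1 + f/700)), the equal-loudness weighting (a rational approximation commonly used in perceptual linear prediction), the Hamming window, the FFT size, and the reading of "the first six terms" as the lags r[1] to r[6] used together with r[0]. All function names are hypothetical.

```python
import numpy as np

def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation r[0..order] -> LP polynomial a (a[0] = 1)."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def hz_to_mel(f):
    # Assumed mel formula; the abstract does not say which one was used.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def equal_loudness(f):
    # Assumed weighting: a rational approximation of an equal-loudness curve.
    w2 = (2.0 * np.pi * f) ** 2
    return ((w2 + 56.8e6) * w2 ** 2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))

def auditory_formants(frame, fs, order1=14, order2=6, nfft=512):
    """Return (mel, Hz) frequencies of up to three complex pole pairs."""
    # Step 1: first LP analysis (autocorrelation method) on the windowed frame.
    x = frame * np.hamming(len(frame))
    n = len(x)
    r = np.correlate(x, x, mode="full")[n - 1:n + order1]
    a, err = levinson(r, order1)

    # Step 2: LP power spectrum on a linear grid from 0 to fs/2.
    f_lin = np.linspace(0.0, fs / 2.0, nfft + 1)
    p_lin = err / np.abs(np.fft.rfft(a, 2 * nfft)) ** 2

    # Step 3: resample at points equally spaced in mel, then apply the weighting.
    m_max = hz_to_mel(fs / 2.0)
    f_warp = mel_to_hz(np.linspace(0.0, m_max, nfft + 1))
    p_mel = np.interp(f_warp, f_lin, p_lin) * equal_loudness(f_warp)

    # Step 4: inverse Fourier transform of the power spectrum gives an autocorrelation.
    r_mel = np.fft.irfft(p_mel)[:order2 + 1]

    # Step 5: second LP analysis of order 6, i.e. at most three conjugate pole pairs.
    b, _ = levinson(r_mel, order2)
    poles = np.roots(b)
    poles = poles[np.imag(poles) > 0]  # one pole per conjugate pair; real poles are dropped
    mels = np.sort(np.angle(poles) / np.pi * m_max)
    return mels, mel_to_hz(mels)

# Usage on a synthetic frame: three damped sinusoids plus a little noise.
fs = 8000
t = np.arange(240) / fs
rng = np.random.default_rng(0)
frame = sum(np.exp(-60.0 * t) * np.sin(2.0 * np.pi * f0 * t) for f0 in (500.0, 1500.0, 2500.0))
frame = frame + 0.01 * rng.standard_normal(len(t))
mels, hz = auditory_formants(frame, fs)
print(mels, hz)
```

Because the second analysis works on the warped and weighted spectrum, the pole angles map linearly to mel rather than to hertz; the sketch converts them back to hertz only for convenience. If the order-6 polynomial has real roots, fewer than three pole pairs are returned.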