Decomposing speech into formants: A new look at an old problem

This paper describes a method of decomposing a speech wave into a number of elementary signals, each representing the contribution of an individual vocal‐tract resonance. The decomposition is achieved by filtering out all but the desired resonance component from the speech wave on the basis of a suitably defined criterion of smoothness. The method assumes that the waveforms of the individual resonances can be distinguished according to this smoothness criterion. The parameters of the filter vary with time, following the changing character of the vocal‐tract resonances. Typically, the filter coefficients are adjusted once every pitch period. The number of filter coefficients equals the number of complex resonances of the vocal tract. In addition to excitation parameters, each elementary signal is described by four parameters representing the frequency, amplitude, bandwidth, and phase of a particular formant. The method provides an automatic means of extracting formant parameters for speech recognition a...
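
To make the four-parameter description of an elementary signal concrete, the sketch below synthesizes one pitch period of a single resonance and sums several of them. It illustrates only the signal model, not the smoothness-based filter itself. The exponentially damped sinusoid is an assumed waveform (a common model of a vocal-tract resonance), not a form stated in the abstract, and the function names, sampling rate, and parameter values are all hypothetical.

```python
import math


def formant_component(freq_hz, amp, bandwidth_hz, phase_rad,
                      n_samples, sample_rate_hz):
    """Return one pitch period of a single resonance.

    Assumed model (not taken from the paper): a damped sinusoid
        amp * exp(-pi * bandwidth * t) * cos(2 * pi * freq * t + phase)
    """
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        # The bandwidth sets the decay rate of the envelope.
        envelope = amp * math.exp(-math.pi * bandwidth_hz * t)
        samples.append(envelope * math.cos(2.0 * math.pi * freq_hz * t + phase_rad))
    return samples


def sum_components(components):
    """Add equal-length elementary signals sample by sample."""
    return [sum(values) for values in zip(*components)]


# Hypothetical (frequency Hz, amplitude, bandwidth Hz, phase rad) per formant.
formants = [
    (500.0, 1.0, 60.0, 0.0),
    (1500.0, 0.5, 90.0, 0.0),
    (2500.0, 0.25, 120.0, 0.0),
]

# One 10 ms pitch period at an assumed 8 kHz sampling rate.
period = [formant_component(f, a, b, p, 80, 8000.0) for f, a, b, p in formants]
speech_like = sum_components(period)
```

Under this assumed model, the decomposition described above runs in the opposite direction: given the summed wave, it recovers each elementary signal, and the four parameters are then read off per pitch period.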