Estimation of the instantaneous harmonic parameters of speech

This paper describes a method for accurately estimating the instantaneous harmonic parameters of a speech signal. The method is based on adaptive filtering of the speech signal along its harmonic components. A simple filter synthesis technique based on the Fourier transform is also proposed. The synthesized filters have a closed-form impulse response that can be modulated in the frequency domain to improve performance for components whose frequency changes rapidly. The method can also be used to obtain an accurate estimate of the fundamental frequency of speech.
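
The idea of filtering along a harmonic can be illustrated with a minimal sketch. It is not the filter design proposed in the paper. It assumes that the instantaneous parameters are amplitude, phase, and frequency, that a rough per-sample fundamental frequency track is already available, and that a Hann-windowed sinc is an acceptable stand-in for the closed-form impulse response. The function name `harmonic_parameters` and the parameters `half_bw` and `half_len` are hypothetical.

```python
import numpy as np

def harmonic_parameters(x, fs, f0_track, k, half_bw=50.0, half_len=256):
    """Estimate instantaneous amplitude, phase, and frequency of harmonic k.

    x        : real-valued signal, 1-D array
    fs       : sampling rate in Hz
    f0_track : assumed fundamental frequency per sample, in Hz
    k        : harmonic number (1 = fundamental)
    half_bw  : half bandwidth of the tracking filter in Hz (assumption)
    half_len : half length of the impulse response in samples (assumption)
    """
    x = np.asarray(x, dtype=float)
    f0_track = np.asarray(f0_track, dtype=float)

    # Phase trajectory of harmonic k: running sum of its frequency.
    phase = 2.0 * np.pi * k * np.cumsum(f0_track) / fs

    # Demodulate along the trajectory so the harmonic sits near 0 Hz,
    # even when its frequency changes over time.
    demod = x * np.exp(-1j * phase)

    # Closed-form low-pass impulse response: Hann-windowed sinc,
    # normalized to unit gain at 0 Hz.
    m = np.arange(-half_len, half_len + 1)
    h = (2.0 * half_bw / fs) * np.sinc(2.0 * half_bw * m / fs)
    h = h * np.hanning(len(m))
    h = h / h.sum()

    # Low-pass filtering isolates the demodulated harmonic.
    base = np.convolve(demod, h, mode="same")

    # Shift back to the harmonic's own frequency; the factor 2 restores
    # the amplitude of the real cosine component.
    analytic = 2.0 * base * np.exp(1j * phase)

    amp = np.abs(analytic)
    inst_phase = np.unwrap(np.angle(analytic))
    inst_freq = np.gradient(inst_phase) * fs / (2.0 * np.pi)
    return amp, inst_phase, inst_freq
```

Demodulating, low-pass filtering, and remodulating is equivalent to applying a band-pass filter whose center frequency follows the harmonic, which is one way to read "modulated in the frequency domain". Dividing the returned `inst_freq` by `k` gives a refined fundamental frequency estimate for that harmonic. Values within `half_len` samples of either end of the signal are affected by zero padding and should be discarded.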
