Paper information - Analysis/synthesis of speech based on an adaptive quasi-harmonic plus noise model

Decomposing speech into a deterministic part and a stochastic part is a common modeling approach. Usually, the deterministic part of voiced speech is modeled as a sum of time-varying sinusoids, while the stochastic part is modeled as modulated noise. The estimation of the sinusoidal parameters assumes that speech is locally stationary. However, this assumption does not hold, which leads to biased amplitude and phase estimates. In this paper, we develop a scheme for speech analysis and synthesis that can handle locally nonstationary frames. The deterministic part is modeled using an adaptive quasi-harmonic model, while the stochastic part is modeled as time-modulated and frequency-modulated noise. Results show that the reconstructed signal is almost indistinguishable from the original.
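To make the deterministic plus stochastic decomposition concrete, the following is a minimal sketch of the synthesis side of a plain harmonic plus noise model: a sum of harmonics of a time-varying fundamental, plus white noise shaped by a time envelope. It is not the adaptive quasi-harmonic scheme of the paper, it omits the frequency modulation of the noise, and it holds the harmonic amplitudes constant. The function name `synthesize_hnm` and all parameter names and values are assumptions chosen for illustration.

```python
import math
import random


def synthesize_hnm(f0_track, harmonic_amps, fs, noise_envelope, seed=0):
    """Sum of harmonics of a time-varying f0 plus time-modulated white noise.

    f0_track:       fundamental frequency in Hz, one value per sample
    harmonic_amps:  amplitude of harmonic 1, 2, ... (held constant here)
    fs:             sampling rate in Hz
    noise_envelope: non-negative gain per sample applied to the noise
    """
    rng = random.Random(seed)
    deterministic, stochastic = [], []
    phase = 0.0  # running phase of the fundamental, in radians
    for f0, env in zip(f0_track, noise_envelope):
        # Harmonic k uses k times the fundamental phase, so it tracks f0.
        deterministic.append(
            sum(a * math.cos((k + 1) * phase)
                for k, a in enumerate(harmonic_amps))
        )
        # Stochastic part: white Gaussian noise scaled by the time envelope.
        stochastic.append(env * rng.gauss(0.0, 1.0))
        # Integrate the instantaneous frequency to advance the phase.
        phase += 2.0 * math.pi * f0 / fs
    signal = [d + s for d, s in zip(deterministic, stochastic)]
    return signal, deterministic, stochastic


# Illustrative values only: 0.1 s at 8 kHz with a pitch glide from 100 to 120 Hz.
fs = 8000
n = 800
f0_track = [100.0 + 20.0 * i / n for i in range(n)]
noise_envelope = [0.05 * i / n for i in range(n)]  # noise fades in over time
signal, deterministic, stochastic = synthesize_hnm(
    f0_track, [1.0, 0.5, 0.25], fs, noise_envelope
)
```

Accumulating the phase sample by sample, rather than evaluating cos(2πk·f0·t) directly, keeps the harmonics continuous when the fundamental changes over time. Within a single analysis frame, however, estimators that assume constant parameters still face the nonstationarity that the abstract identifies as the source of bias.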

Yannis Stylianou | Olivier Rosec | Yannis Pantazis | Georgios Tzedakis
