Modeling speech signals in the time-frequency domain using GARCH

In this paper, we introduce a novel modeling approach for speech signals in the short-time Fourier transform (STFT) domain. We define the conditional variance of the STFT expansion coefficients, and model the one-frame-ahead conditional variance as a generalized autoregressive conditional heteroscedasticity (GARCH) process. The proposed approach offers a reasonable model on which to base the estimation of the variances of the STFT expansion coefficients, while taking into consideration their heavy-tailed distribution.
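As a rough illustration of the one-frame-ahead conditional variance, the sketch below runs a textbook GARCH(1,1) recursion over the STFT coefficients of a single frequency bin. This is not necessarily the exact parameterization used in the paper: the function name `garch11_conditional_variance`, the parameter names `omega`, `alpha`, `beta`, and the choice of the unconditional variance as the starting value are all assumptions made for the example.

```python
def garch11_conditional_variance(coeffs, omega, alpha, beta, init_var=None):
    """Textbook GARCH(1,1) recursion for one frequency bin (illustrative sketch).

    coeffs   : sequence of (possibly complex) STFT coefficients, one per frame
    omega    : positive constant term (assumed name)
    alpha    : weight on the previous squared magnitude (assumed name)
    beta     : weight on the previous conditional variance (assumed name)
    init_var : variance assumed for the first frame; defaults to the
               unconditional variance omega / (1 - alpha - beta)

    Returns a list of len(coeffs) + 1 variances: entry t is the variance of
    frame t given frames 0..t-1, and the last entry is the forecast for the
    frame that follows the final observed one.
    """
    if omega <= 0 or alpha < 0 or beta < 0:
        raise ValueError("need omega > 0 and alpha, beta >= 0")
    if alpha + beta >= 1:
        raise ValueError("need alpha + beta < 1 for a finite unconditional variance")

    var = omega / (1.0 - alpha - beta) if init_var is None else float(init_var)
    variances = [var]
    for x in coeffs:
        # The next frame's variance depends on the current squared magnitude
        # and on the current conditional variance.
        var = omega + alpha * abs(x) ** 2 + beta * var
        variances.append(var)
    return variances


if __name__ == "__main__":
    # Hypothetical coefficients for one bin over three frames.
    frames = [1 + 1j, 0j, 2 + 0j]
    print(garch11_conditional_variance(frames, omega=0.1, alpha=0.2, beta=0.7))
```

A large coefficient in one frame raises the predicted variance of the next, which is how the recursion tracks bursts of speech energy; with `alpha + beta` close to 1 the variance decays slowly after such a burst.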
