Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model

This contribution presents two spectral amplitude estimators for acoustic background noise suppression, based on maximum a posteriori (MAP) estimation and super-Gaussian statistical modelling of the speech DFT amplitudes. The probability density function of the speech spectral amplitude is modelled with a simple parametric function, which closely approximates the amplitude distribution that results when the real and imaginary parts of the speech DFT coefficients are Laplace- or Gamma-distributed. The statistical model can also be adapted to optimally fit the distribution of the speech spectral amplitudes for a specific noise reduction system. Based on the super-Gaussian statistical model, computationally efficient MAP speech estimators are derived, which outperform the commonly applied Ephraim-Malah algorithm.
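The abstract does not state the parametric function or the resulting gain rules, so the following is only an illustrative sketch under explicit assumptions. It assumes a gamma-like amplitude prior p(A) proportional to A^nu * exp(-mu * A / sigma_s), additive complex Gaussian noise with variance sigma_n^2, and a uniform speech phase. Under those assumptions the joint amplitude and phase MAP estimate keeps the noisy phase, and setting the derivative of the log posterior with respect to A to zero gives a quadratic whose positive root is the gain below. The names `nu`, `mu`, `joint_map_gain`, and `enhance_bin` are hypothetical and chosen for this example only.

```python
import math


def log_posterior(a, r, sigma_s, sigma_n, nu, mu):
    """Log of p(Y | A) * p(A) up to an additive constant.

    Assumes the phase has already been set to the noisy phase, Gaussian
    noise, and the assumed prior p(A) ~ A**nu * exp(-mu * A / sigma_s).
    """
    return -((r - a) ** 2) / sigma_n ** 2 + nu * math.log(a) - mu * a / sigma_s


def joint_map_gain(xi, gamma, nu, mu):
    """Positive root of G^2 - G*(1 - mu/(2*sqrt(gamma*xi))) - nu/(2*gamma) = 0.

    xi    : a priori SNR, sigma_s^2 / sigma_n^2
    gamma : a posteriori SNR, |Y|^2 / sigma_n^2
    nu, mu: shape parameters of the assumed prior (nu >= 0, mu >= 0)
    """
    u = 0.5 - mu / (4.0 * math.sqrt(gamma * xi))
    return u + math.sqrt(u * u + nu / (2.0 * gamma))


def enhance_bin(y, sigma_s, sigma_n, nu, mu):
    """Apply the gain to one noisy complex DFT coefficient y."""
    r = abs(y)
    if r == 0.0:
        return 0j  # the a posteriori SNR is zero, nothing to scale
    xi = sigma_s ** 2 / sigma_n ** 2
    gamma = r ** 2 / sigma_n ** 2
    return joint_map_gain(xi, gamma, nu, mu) * y


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper.
    print(enhance_bin(2.0 + 0.0j, sigma_s=1.5, sigma_n=1.0, nu=0.5, mu=1.2))
```

With `nu = 0` and `mu = 0` the assumed prior is flat and the gain reduces to 1, which is the maximum likelihood solution. The gain needs only one square root per frequency bin, which is consistent with the abstract's point about computational efficiency, although the estimators derived in the paper may differ from this sketch.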
