Multiplicative Update of Auto-Regressive Gains for Codebook-Based Speech Enhancement

In this paper, we present a novel method for estimating the short-term linear predictive (LP) parameters of speech and noise in codebook-driven Wiener filtering for speech enhancement. We use only a pretrained spectral-shape codebook of speech to model the a priori information about the LP coefficients of speech; the spectral shape of noise is estimated online rather than drawn from a noise codebook, which avoids the noise classification problem. Whereas existing codebook-driven methods estimate the LP gains of speech and noise by maximum likelihood, the proposed method uses a multiplicative update rule to estimate these gains more accurately. The more accurate gains help preserve more speech components in the enhanced speech. We also develop a Bayesian parameter estimator that does not require a noise codebook. Finally, we develop an improved codebook-driven Wiener filter that incorporates the speech-presence probability, so that the proposed method removes the residual noise between the harmonics of noisy speech.
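The gain-estimation idea can be illustrated with a minimal sketch. The abstract does not state the exact update rule, so the code below assumes the standard multiplicative update for the Itakura-Saito divergence known from the non-negative matrix factorization literature, applied to a single frame with one speech spectral shape and one noise spectral shape held fixed. The function names, the AR coefficients, and the synthetic gains are hypothetical and chosen only for illustration.

```python
import numpy as np


def ar_spectral_shape(a, n_fft=256):
    """Unit-gain AR spectral envelope 1/|A(e^{jw})|^2 for a = [1, a_1, ..., a_p]."""
    A = np.fft.rfft(np.asarray(a, dtype=float), n=n_fft)
    return 1.0 / np.maximum(np.abs(A) ** 2, 1e-12)


def is_divergence(p, p_hat):
    """Itakura-Saito divergence between an observed and a modeled power spectrum."""
    r = p / p_hat
    return float(np.sum(r - np.log(r) - 1.0))


def update_gains(p_noisy, shape_s, shape_n, n_iter=200, eps=1e-12):
    """Estimate nonnegative gains (g_s, g_n) so that
    p_noisy ~= g_s * shape_s + g_n * shape_n.

    Assumed rule (not taken from the paper): the Itakura-Saito
    multiplicative update with the spectral shapes held fixed.
    """
    W = np.stack([shape_s, shape_n], axis=1)      # shape (K, 2)
    g = 0.5 * p_noisy.mean() / W.mean(axis=0)     # strictly positive start
    for _ in range(n_iter):
        p_hat = W @ g + eps                       # current model spectrum
        num = W.T @ (p_noisy / p_hat ** 2)
        den = W.T @ (1.0 / p_hat)
        g = g * num / (den + eps)                 # gains stay nonnegative
    return g


def wiener_gain(g, shape_s, shape_n, eps=1e-12):
    """Per-bin Wiener filter built from the estimated speech and noise spectra."""
    p_s = g[0] * shape_s
    p_n = g[1] * shape_n
    return p_s / (p_s + p_n + eps)


# Synthetic single-frame example with hypothetical AR models.
shape_s = ar_spectral_shape([1.0, -1.3435, 0.9025])   # low-frequency resonance
shape_n = ar_spectral_shape([1.0, 0.9])               # high-frequency emphasis
p_noisy = 2.0 * shape_s + 0.5 * shape_n               # true gains: 2.0 and 0.5

g = update_gains(p_noisy, shape_s, shape_n)
H = wiener_gain(g, shape_s, shape_n)
print("estimated gains:", g)
print("IS divergence:", is_divergence(p_noisy, g[0] * shape_s + g[1] * shape_n))
```

In a codebook-driven system this update would presumably be repeated for each speech codebook entry, with the online noise estimate supplying `shape_n`. Because each step multiplies the gains by a ratio of nonnegative terms, they cannot become negative, which is the main practical attraction of a multiplicative rule.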
