Speech enhancement based on AR model parameters estimation

Speech and noise codebooks are trained as a priori information for speech enhancement. The EM algorithm is employed to estimate the AR model gains of speech and noise. The ambiguity problem is reduced by using the KNN rule. We propose an a posteriori SPP estimation method based on a sigmoid function. The residual noise between the harmonics of voiced speech is removed.

In this paper, we propose a method for estimating the auto-regressive (AR) model parameters of speech and noise under noisy conditions for use in speech enhancement. The method exploits a priori information about the spectral shapes of speech and noise (parameterized as AR coefficients), which is described by trained codebooks. The expectation maximization (EM) algorithm is first employed to obtain the AR gains of speech and noise corresponding to each pair of speech and noise codebook entries. The K-nearest neighbor (KNN) rule is then used to select candidates from the optimized AR parameters (AR coefficients and AR gains) of speech and noise, from which a weighted Wiener filter (WWF) is constructed. Furthermore, we propose an a posteriori speech-presence probability (SPP) estimation method based on a sigmoid function. Combining the a posteriori SPP with the WWF effectively reduces the residual noise in the enhanced speech. Test results show that the proposed speech enhancement scheme outperforms the reference methods.
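The processing chain described above (AR spectral shapes, KNN candidate selection, weighted Wiener filter, sigmoid-based SPP) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance measure, the candidate weights, or the argument of the sigmoid, so the log-spectral distance, the inverse-distance weighting, the a posteriori SNR input, and the parameters `slope` and `offset` are all assumptions made for illustration. The EM gain estimation is not shown; the AR gains are assumed to be given.

```python
import cmath
import math

def ar_spectrum(coeffs, gain, n_bins):
    """AR power spectrum gain / |A(e^{jw})|^2, with A(z) = 1 + sum_k a_k z^-k."""
    spectrum = []
    for i in range(n_bins):
        w = math.pi * i / (n_bins - 1)  # frequencies from 0 to pi
        a = 1 + sum(c * cmath.exp(-1j * w * (k + 1)) for k, c in enumerate(coeffs))
        spectrum.append(gain / abs(a) ** 2)
    return spectrum

def log_spectral_distance(p, q):
    """Assumed distance measure; the abstract does not name one."""
    return math.sqrt(sum((math.log(x) - math.log(y)) ** 2 for x, y in zip(p, q)) / len(p))

def weighted_wiener_filter(noisy_psd, candidates, k):
    """Average the Wiener gains of the k candidates nearest to the observation.

    candidates is a list of (speech_psd, noise_psd) pairs. Inverse-distance
    weights are an assumption, not taken from the source.
    """
    scored = sorted(
        ((log_spectral_distance(noisy_psd, [s + n for s, n in zip(sp, nz)]), sp, nz)
         for sp, nz in candidates),
        key=lambda t: t[0],
    )[:k]
    weights = [1.0 / (d + 1e-9) for d, _, _ in scored]
    total = sum(weights)
    gains = [0.0] * len(noisy_psd)
    for w, (_, sp, nz) in zip(weights, scored):
        for i, (s, n) in enumerate(zip(sp, nz)):
            gains[i] += (w / total) * s / (s + n)
    return gains

def sigmoid_spp(noisy_psd, noise_psd, slope=1.0, offset=1.0):
    """Hypothetical SPP: sigmoid of the per-bin a posteriori SNR."""
    return [1.0 / (1.0 + math.exp(-slope * (y / n - offset)))
            for y, n in zip(noisy_psd, noise_psd)]

# Toy codebooks with AR(1) shapes; gains are assumed already estimated.
n_bins = 8
speech_shapes = [[-0.9], [0.5]]
noise_shape = [0.0]
noise_psd = ar_spectrum(noise_shape, 0.1, n_bins)
candidates = [(ar_spectrum(c, 1.0, n_bins), noise_psd) for c in speech_shapes]

# Synthetic observation built from the first codebook pair.
noisy = [s + n for s, n in zip(candidates[0][0], noise_psd)]

gains_k1 = weighted_wiener_filter(noisy, candidates, k=1)
wwf = weighted_wiener_filter(noisy, candidates, k=2)
spp = sigmoid_spp(noisy, noise_psd)
final_gain = [p * g for p, g in zip(spp, wwf)]  # SPP-weighted WWF gain per bin
```

In this toy setup the observation matches the first codebook pair exactly, so with `k=1` the filter reduces to the ordinary Wiener gain of that pair. Multiplying by the SPP lowers the gain further in bins where speech is unlikely to be present.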
