On phase importance in parameter estimation in single-channel speech enhancement

In this paper, we study the impact of exploiting spectral phase information to further improve the speech quality of single-channel speech enhancement algorithms. In particular, we focus on the two steps required in a typical single-channel speech enhancement system: parameter estimation, solved by a minimum mean square error (MMSE) estimator of the speech spectral amplitude, followed by a signal reconstruction stage, where the observed noisy phase is often used. For the parameter estimation stage, in contrast to the conventional Wiener filter, a new MMSE estimator is derived that takes the clean phase into account as prior information. In our experiments, we show that including phase information in both steps significantly improves the perceived quality of the enhanced signal relative to methods that do not employ phase information.
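The two-step structure described above can be illustrated with a minimal per-bin sketch. It is not the estimator derived in the paper: it assumes the conventional Wiener gain for the amplitude step, and the function names (`wiener_gain`, `reconstruct_bin`, `enhance_bin`) and the toy values are hypothetical, chosen only to show where the phase enters the reconstruction.

```python
import cmath

def wiener_gain(speech_psd, noise_psd):
    """Conventional Wiener gain xi / (1 + xi), where xi is the a priori SNR."""
    xi = speech_psd / noise_psd
    return xi / (1.0 + xi)

def reconstruct_bin(amplitude, phase):
    """Combine an amplitude estimate and a phase into one complex STFT coefficient."""
    return cmath.rect(amplitude, phase)

def enhance_bin(noisy_coeff, speech_psd, noise_psd, phase=None):
    """Step 1: estimate the speech amplitude. Step 2: reconstruct with a phase.

    If no phase is supplied, the observed noisy phase is reused, which is
    the common choice mentioned in the abstract.
    """
    amplitude = wiener_gain(speech_psd, noise_psd) * abs(noisy_coeff)
    if phase is None:
        phase = cmath.phase(noisy_coeff)
    return reconstruct_bin(amplitude, phase)

# Toy single-bin example (assumed values, not data from the paper).
clean = cmath.rect(1.0, 0.3)   # clean speech coefficient
noise = cmath.rect(0.5, 2.0)   # additive noise coefficient
noisy = clean + noise

with_noisy_phase = enhance_bin(noisy, speech_psd=1.0, noise_psd=0.25)
with_clean_phase = enhance_bin(noisy, speech_psd=1.0, noise_psd=0.25,
                               phase=cmath.phase(clean))

print("error with noisy phase:", abs(with_noisy_phase - clean))
print("error with clean phase:", abs(with_clean_phase - clean))
```

Both calls apply the same amplitude estimate, so any difference in reconstruction error in this toy case comes from the phase alone. In practice the clean phase is unknown and would have to be estimated.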
