Enhancement of Single-Channel Periodic Signals in the Time-Domain

Most state-of-the-art filtering methods for speech enhancement require an estimate of the noise statistics, which are difficult to estimate in practice while speech is present. Nonstationary noise therefore degrades the performance of most speech enhancement filters. Its impact can be reduced by designing the filter from the signal statistics rather than the noise statistics, for example by assuming a harmonic model for the desired signal. This model fits voiced speech well but is not appropriate for unvoiced speech, so signal-dependent methods based on the signal statistics introduce undesired distortion in some parts of the speech compared with signal-independent methods based on the noise statistics. Since both approaches have advantages, it is relevant to combine them to reduce the impact of their individual disadvantages. In this paper, we give theoretical insights that reveal a close relationship between the two approaches. This justifies the joint use of such filtering methods, which can be beneficial from a practical point of view. Our experimental results confirm that both signal-independent and signal-dependent approaches have advantages and that they are closely related. Moreover, as part of our experiments, we illustrate the practical usefulness of combining the two kinds of enhancement methods by applying them jointly to real-life speech.
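
For reference, the harmonic model mentioned in the abstract is commonly written as a sum of sinusoids at integer multiples of a fundamental frequency, observed in additive noise. The notation below (y, x, v, L, \omega_0, A_l, \phi_l) is an assumption chosen for illustration and is not taken from the abstract itself:

```latex
% Assumed notation: y(n) noisy observation, x(n) desired signal, v(n) noise,
% L number of harmonics, \omega_0 fundamental frequency,
% A_l and \phi_l amplitude and phase of the l-th harmonic.
\begin{align}
  y(n) &= x(n) + v(n), \\
  x(n) &= \sum_{l=1}^{L} A_l \cos\left(l \omega_0 n + \phi_l\right).
\end{align}
```

Over short segments, voiced speech is approximately periodic and is therefore well described by such a sum. Unvoiced speech has no fundamental frequency, which is why a filter built on this model can distort it.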
