A perceptually motivated approach for speech enhancement

A new perceptually motivated approach is proposed for the enhancement of speech corrupted by colored noise. The approach takes into account the frequency masking properties of the human auditory system and reduces the perceptual effect of the residual noise. The perceptual method is incorporated into both a frequency-domain speech enhancement method and a subspace-based speech enhancement method. An improved power spectrum/autocorrelation function estimator was also developed to raise the performance of the proposed algorithms. Objective measures and informal listening tests demonstrated significant improvements over other methods when tested with TIMIT sentences corrupted by various types of colored noise.
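To make the idea of exploiting frequency masking concrete, the following is a minimal sketch of one well-known style of perceptual gain rule: each frequency bin is attenuated only as far as needed to push the residual noise under the masking threshold. It is an illustration under stated assumptions, not the estimator proposed in this paper. The function names `perceptual_gain` and `apply_gain`, the `floor` parameter, and the idea that the per-bin noise power and masking threshold are already available are all assumptions made for the example.

```python
import math


def perceptual_gain(noise_psd, mask_threshold, floor=0.05):
    """Return one amplitude gain per frequency bin.

    Assumptions (hypothetical, not from the paper): noise_psd holds the
    estimated noise power per bin, mask_threshold holds the masking
    threshold per bin in the same power units, and floor is a lower
    bound on the gain that limits over-suppression.
    """
    gains = []
    for n, t in zip(noise_psd, mask_threshold):
        if n <= t:
            # Noise is already below the masking threshold: leave the bin alone.
            gains.append(1.0)
        else:
            # Choose g so that the residual noise power g^2 * n equals t,
            # but never attenuate below the gain floor.
            gains.append(max(floor, math.sqrt(t / n)))
    return gains


def apply_gain(noisy_spectrum, gains):
    """Scale each (real or complex) spectral coefficient by its bin gain."""
    return [g * x for g, x in zip(gains, noisy_spectrum)]


if __name__ == "__main__":
    noise = [1.0, 4.0, 10000.0]
    mask = [2.0, 1.0, 1.0]
    print(perceptual_gain(noise, mask))
```

In this sketch, bins whose noise is already inaudible are passed through unchanged, which is the sense in which such a rule trades less noise reduction for less speech distortion. How the masking threshold is computed, and how the rule is carried over to a subspace method, is left out here.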
