A two-stage algorithm for one-microphone reverberant speech enhancement

Under noise-free conditions, the quality of reverberant speech depends on two distinct perceptual components: coloration and long-term reverberation. These correspond to two physical variables: signal-to-reverberant energy ratio (SRR) and reverberation time, respectively. Motivated by this observation, we propose a two-stage algorithm for enhancing reverberant speech recorded with one microphone. In the first stage, an inverse filter is estimated to reduce coloration effects, that is, to increase SRR. The second stage employs spectral subtraction to minimize the influence of long-term reverberation. The proposed algorithm significantly improves the quality of reverberant speech. In a comparison with a recent enhancement algorithm on a corpus of speech utterances under a number of reverberant conditions, our algorithm performs substantially better.
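To make the two quantities concrete, the following is a minimal sketch, not the algorithm evaluated in the paper. It assumes that SRR can be approximated from a known room impulse response by splitting it shortly after its main peak, and that late reverberation in a frame can be approximated by a scaled, delayed copy of the short-time power spectrum. The function names and the parameters `direct_ms`, `delay_frames`, `gamma`, and `floor` are hypothetical and chosen only for illustration.

```python
import numpy as np


def srr_db(rir, fs, direct_ms=2.0):
    """Approximate SRR in dB from a room impulse response.

    Assumption: everything up to `direct_ms` after the main peak counts as
    the direct path; the remainder counts as reverberant energy.
    """
    rir = np.asarray(rir, dtype=float)
    peak = int(np.argmax(np.abs(rir)))
    split = peak + int(round(direct_ms * 1e-3 * fs)) + 1
    direct = np.sum(rir[:split] ** 2)
    reverb = np.sum(rir[split:] ** 2)
    if reverb == 0.0:
        return float("inf")  # no reverberant tail at all
    return 10.0 * np.log10(direct / reverb)


def subtract_late_reverb(power, delay_frames=4, gamma=0.3, floor=0.01):
    """Power spectral subtraction of an assumed late-reverberation estimate.

    power: array of shape (frames, bins) holding a short-time power spectrum.
    Assumption: the late-reverberant power in frame t is gamma times the
    observed power in frame t - delay_frames.
    """
    if delay_frames < 1:
        raise ValueError("delay_frames must be at least 1")
    power = np.asarray(power, dtype=float)
    late = np.zeros_like(power)
    late[delay_frames:] = gamma * power[:-delay_frames]
    # Per-bin gain, floored so that no bin is driven to zero or below.
    gain = np.maximum((power - late) / np.maximum(power, 1e-12), floor)
    return gain * power


# A toy decaying tail in one frequency bin: each frame is half the previous.
tail = np.array([[1.0], [0.5], [0.25], [0.125]])
cleaned = subtract_late_reverb(tail, delay_frames=1, gamma=0.5)
```

In this toy case the decay matches the assumed model exactly, so every frame after the first is reduced to the spectral floor while the first frame is left untouched. Real reverberant speech would need an estimate of the delay and scaling from the signal itself.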
