Blind dereverberation of monaural speech signals based on harmonic structure

This paper presents a new method for dereverberating monaural speech signals. Speech signals captured by a distant microphone usually contain substantial reverberation, which severely degrades the performance of speech applications such as automatic speech recognition (ASR) systems. Although a number of dereverberation methods have been proposed, dereverberation remains a challenging problem, especially when only a single microphone is available. To overcome this problem, we propose a dereverberation method based on an inherent property of speech signals, namely, their harmonic structure. We show that a filter that enhances the harmonic structure of reverberant speech signals approximates an inverse filter of the reverberation process, and can achieve high-quality speech dereverberation. Experimental results show that a dereverberation filter trained with a sufficient amount of observed reverberant signals can effectively reduce reverberation when the reverberation time is 0.1 to 1.0 s. © 2006 Wiley Periodicals, Inc. Syst Comp Jpn, 37(6): 1–12, 2006; Published online in Wiley InterScience. DOI 10.1002/scj.20509
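The abstract's central claim, that a harmonic-structure-enhancing filter trained over many observed frames approximates an inverse filter of the reverberation process, can be sketched numerically. The following is a minimal illustration of the averaging principle only, not the paper's actual method: the geometric-decay room response, the unit-magnitude source spectra, and the additive noise standing in for an imperfect harmonic estimate are all assumptions chosen for a self-contained example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins = 129    # rfft bins for a 256-sample analysis frame
n_frames = 200

# Toy room transfer function (assumption): a geometric-decay impulse
# response, so |H(f)| stays well away from zero at every frequency.
rir = 0.5 ** np.arange(16)
H = np.fft.rfft(rir, 256)

# Principle: each short-time observation is X(f, t) = H(f) S(f, t).
# A harmonic-structure enhancer gives a rough source estimate
# S_hat(f, t); averaging S_hat / X over many frames cancels the
# per-frame estimation error and converges to the inverse filter 1 / H.
W = np.zeros(n_bins, dtype=complex)
for _ in range(n_frames):
    phase = rng.uniform(0.0, 2.0 * np.pi, n_bins)
    S = np.exp(1j * phase)            # unit-magnitude source spectrum
    X = H * S                         # observed reverberant spectrum
    noise = 0.1 * (rng.standard_normal(n_bins)
                   + 1j * rng.standard_normal(n_bins))
    S_hat = S + noise                 # imperfect harmonic estimate (assumption)
    W += S_hat / X
W /= n_frames

# After averaging, W should sit close to the true inverse filter 1 / H.
err = np.max(np.abs(W - 1.0 / H))
print(err)
```

The per-frame ratio S_hat / X is individually noisy, but its error has zero mean, so the average over a sufficient number of frames tightens around 1 / H; this mirrors the abstract's observation that the filter needs a sufficient amount of observed reverberant signal to be effective.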
