Paper information - A robust interaural time differences estimation and dereverberation algorithm based on the coherence function

A robust interaural time differences estimation and dereverberation algorithm based on the coherence function

Abstract A novel scheme for binaural sound localization and dereverberation in reverberant environments is presented in this paper. The performance of traditional time-delay estimation methods based on cross-correlation degrades sharply in reverberant environments. Some precedence effect models have been proposed for use with cross-correlation functions, but these models are parameter-sensitive and their front-end processing is very complex. This paper first proposes a simple and effective time-delay estimation method based on the coherence function, in which the absolute value of the coherence function is used to judge the reliability of the frequency-domain signal. The estimated time-delay values are then applied to the coherent-to-diffuse power ratio (CDR) estimator, which can be used for reverberation suppression. Experimental results show that the proposed scheme has higher localization accuracy than traditional methods and achieves higher PESQ scores than other CDR estimators.
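The idea of using the magnitude of the coherence function as a reliability measure can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details are not given in the abstract. It assumes a long-term averaged coherence estimate, a hard threshold on its magnitude, and a phase-only (PHAT-like) inverse transform. The function name `coherence_weighted_delay` and the parameters `frame_len`, `hop`, `threshold`, and `max_delay_s` are hypothetical choices made for illustration.

```python
import numpy as np


def coherence_weighted_delay(left, right, fs, frame_len=512, hop=256,
                             threshold=0.5, max_delay_s=1e-3):
    """Estimate the delay of `right` relative to `left`, in seconds.

    Illustrative assumption: frequency bins whose coherence magnitude
    falls below `threshold` are treated as unreliable and discarded.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)

    # Slice both channels into overlapping Hann-windowed frames.
    window = np.hanning(frame_len)
    n_frames = 1 + (len(left) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    spec_l = np.fft.rfft(left[idx] * window, axis=1)
    spec_r = np.fft.rfft(right[idx] * window, axis=1)

    # Averaged cross- and auto-spectra give the complex coherence per bin.
    p_lr = np.mean(np.conj(spec_l) * spec_r, axis=0)
    p_ll = np.mean(np.abs(spec_l) ** 2, axis=0)
    p_rr = np.mean(np.abs(spec_r) ** 2, axis=0)
    coherence = p_lr / np.sqrt(p_ll * p_rr + 1e-12)

    # Keep only the phase of bins judged reliable by |coherence|.
    reliable = np.abs(coherence) >= threshold
    phase_only = np.where(reliable,
                          coherence / (np.abs(coherence) + 1e-12), 0.0)

    # Back to the lag domain; negative lags wrap to the end of the buffer.
    cc = np.fft.irfft(phase_only, n=frame_len)
    max_lag = int(round(max_delay_s * fs))
    lags = np.arange(-max_lag, max_lag + 1)
    best_lag = lags[np.argmax(cc[lags % frame_len])]
    return best_lag / fs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 16000
    source = rng.standard_normal(fs)
    delay = 8  # samples by which the right channel lags the left
    left = source + 0.05 * rng.standard_normal(fs)
    right = (np.concatenate([np.zeros(delay), source[:-delay]])
             + 0.05 * rng.standard_normal(fs))
    print(coherence_weighted_delay(left, right, fs))
```

A positive return value means the right channel lags the left. In a reverberant recording, a short-time (for example, recursively smoothed) coherence estimate would probably be more appropriate than the long-term average used here, since reliability varies over time as well as frequency.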
