A robust interaural time differences estimation and dereverberation algorithm based on the coherence function

Abstract A novel scheme for binaural sound localization and dereverberation in reverberant environments is presented in this paper. The performance of traditional cross-correlation-based time-delay estimation methods degrades sharply in reverberant environments. Some precedence effect models have been proposed for use with cross-correlation functions, but these models are parameter-sensitive and their front-end processing is very complex. This paper first proposes a simple and effective time-delay estimation method based on the coherence function, in which the absolute value of the coherence function is used to judge the reliability of each frequency-domain component. The estimated time delays are then applied to a coherent-to-diffuse power ratio (CDR) estimator, which can be used for reverberation suppression. Experimental results show that the proposed scheme achieves higher localization accuracy than traditional methods and higher PESQ scores than other CDR estimators.
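The idea of using the coherence magnitude as a per-frequency reliability measure can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details are not given in the abstract. It assumes a Welch-style average of the spectra over frames, a phase-transform (PHAT) normalization of the cross-spectrum, and a plain multiplication of each bin by the coherence magnitude. The function name `coherence_weighted_delay` and all parameter defaults are hypothetical.

```python
import numpy as np

def coherence_weighted_delay(left, right, fs, frame_len=512, hop=256,
                             max_delay_s=1e-3):
    """Estimate the delay of `right` relative to `left`, in seconds.

    Illustrative assumption: each frequency bin of a PHAT-normalized
    cross-spectrum is weighted by the magnitude of the coherence function.
    """
    eps = 1e-12
    window = np.hanning(frame_len)
    n_frames = 1 + (len(left) - frame_len) // hop

    # Short-time spectra of both channels, one row per frame.
    spec_l = np.array([np.fft.rfft(window * left[i * hop:i * hop + frame_len])
                       for i in range(n_frames)])
    spec_r = np.array([np.fft.rfft(window * right[i * hop:i * hop + frame_len])
                       for i in range(n_frames)])

    # Frame-averaged auto- and cross-power spectra.
    p_ll = np.mean(np.abs(spec_l) ** 2, axis=0)
    p_rr = np.mean(np.abs(spec_r) ** 2, axis=0)
    p_lr = np.mean(np.conj(spec_l) * spec_r, axis=0)

    # Complex coherence; its magnitude lies in [0, 1] and serves as the
    # reliability weight for each frequency bin.
    coherence = p_lr / np.sqrt(p_ll * p_rr + eps)
    weight = np.abs(coherence)

    # Keep only the phase of the cross-spectrum, then apply the weight.
    phase_only = p_lr / (np.abs(p_lr) + eps)
    cc = np.fft.irfft(weight * phase_only, n=frame_len)

    # Search for the peak within the physically plausible lag range.
    max_lag = int(round(max_delay_s * fs))
    lags = np.arange(-max_lag, max_lag + 1)
    cc_window = np.concatenate([cc[-max_lag:], cc[:max_lag + 1]])
    return lags[np.argmax(cc_window)] / fs
```

With this sign convention a positive result means the right channel lags the left one. Bins dominated by diffuse reverberation tend to have low coherence magnitude and therefore contribute little to the correlation peak, which is the intuition behind the reliability weighting described above.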
