Bi-Direction Interaural Matching Filter and Decision Weighting Fusion for Sound Source Localization in Noisy Environments

Sound source localization is an essential technique in many applications, e.g., speech enhancement, speech capturing and human-robot interaction. However, the performance of traditional methods degrades in noisy or reverberant environments and is sensitive to the spatial location of the sound source. To solve these problems, we propose a sound source localization framework based on a bi-direction interaural matching filter (IMF) and decision weighting fusion. First, the bi-direction IMF is introduced to describe the difference between the binaural signals in the forward and backward directions. Then, a hybrid interaural matching filter (HIMF), obtained from the bi-direction IMF through decision weighting fusion, is used to reduce the influence of the source position on localization. Finally, the cosine similarity between the HIMF computed from the binaural audio and the HIMFs computed from the transfer functions is used to measure the probability of each source location. Arranging the similarities for all spatial directions as a matrix, we determine the source location by Maximum A Posteriori (MAP) estimation. Experimental results indicate that HIMF is more robust in noisy environments than several state-of-the-art methods.

key words: binaural auditory, sound source localization, hybrid interaural matching filter, decision weighting fusion
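
The final matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the HIMF vectors are already available (their estimation and the decision weighting fusion are not specified here), that templates derived from the transfer functions are stored one per candidate direction, and that the prior over directions is uniform, so that MAP estimation reduces to choosing the direction with the highest frame-averaged similarity. The names `cosine_similarity_matrix`, `localize`, `observed` and `templates` are hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(observed, templates, eps=1e-12):
    """Cosine similarity between each observed frame and each direction template.

    observed:  (frames, taps) array, one HIMF-like vector per audio frame (assumed given).
    templates: (directions, taps) array, one HIMF-like vector per candidate direction.
    Returns a (frames, directions) similarity matrix.
    """
    obs = observed / (np.linalg.norm(observed, axis=1, keepdims=True) + eps)
    tmp = templates / (np.linalg.norm(templates, axis=1, keepdims=True) + eps)
    return obs @ tmp.T

def localize(observed, templates):
    """Pick the direction with the highest frame-averaged similarity.

    Assumption: with a uniform prior over directions, the MAP decision is
    approximated by the arg max of the averaged similarity scores.
    """
    sim = cosine_similarity_matrix(observed, templates)
    scores = sim.mean(axis=0)          # one score per candidate direction
    return int(np.argmax(scores)), sim

# Synthetic data only: random templates and noisy copies of one of them.
rng = np.random.default_rng(0)
templates = rng.standard_normal((8, 32))   # 8 hypothetical directions, 32 filter taps
true_dir = 5
observed = templates[true_dir] + 0.1 * rng.standard_normal((20, 32))

best, sim = localize(observed, templates)
print(best, sim.shape)
```

Averaging over frames is one simple way to pool the similarity matrix; a non-uniform prior or a per-frame weighting would change only the `scores` line.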
