A learning-based approach to robust binaural sound localization

Sound source localization is an important capability for robots and intelligent systems. Like other artificial audition tasks, it is hampered by several problems, notably sound reflections and noise. This paper presents an approach to estimating sound source azimuth in reverberant environments. It exploits binaural signals in a humanoid robotic context. Interaural Time and Level Differences (ITD and ILD) are extracted in multiple frequency bands and combined with a neural network-based learning scheme. A cue filtering process is used to reduce the effects of reverberation. The system has been evaluated on simulated and real data, under multiple conditions covering realistic robot operation, and proved satisfactory and effective, as shown and discussed in the paper.
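
To make the two cues concrete, the following is a minimal sketch of how an ITD and an ILD could be estimated from one frame of a binaural signal. It is not the paper's implementation: it works on the full band rather than on multiple frequency bands, it omits the cue filtering and the neural network, and the function name `itd_ild`, the 1 ms lag limit, and the sign conventions (positive ITD when the left channel lags, positive ILD when the left channel is louder) are assumptions made for illustration.

```python
import numpy as np

def itd_ild(left, right, fs, max_itd=1e-3):
    """Estimate a broadband ITD (seconds) and ILD (dB) for one frame.

    Hypothetical helper: ITD is taken as the lag of the cross-correlation
    peak, ILD as the energy ratio of the two channels in decibels.
    """
    max_lag = int(round(max_itd * fs))
    # Full cross-correlation; index len(right) - 1 corresponds to zero lag.
    xcorr = np.correlate(left, right, mode="full")
    zero = len(right) - 1
    lags = np.arange(-max_lag, max_lag + 1)
    window = xcorr[zero - max_lag: zero + max_lag + 1]
    # Positive lag means the left channel is delayed relative to the right.
    itd = lags[np.argmax(window)] / fs
    eps = 1e-12  # guards against log of zero on silent frames
    ild = 10.0 * np.log10((np.sum(left ** 2) + eps) / (np.sum(right ** 2) + eps))
    return itd, ild

# Synthetic check: the left channel is the right channel delayed by
# 5 samples and attenuated by a factor of 2 in amplitude.
fs = 16000
rng = np.random.default_rng(0)
right = rng.standard_normal(2048)
left = 0.5 * np.concatenate([np.zeros(5), right[:-5]])
itd, ild = itd_ild(left, right, fs)
print(itd * fs, ild)
```

In a multi-band setting such as the one the abstract describes, a function of this kind would be applied to each band's output separately, yielding one ITD and ILD pair per band as input features for a learned azimuth estimator.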
