Auditory Inspired Binaural Robust Sound Source Localization in Echoic and Noisy Environments

We propose a new approach to binaural sound source localization in real-world environments, based on a new model of the precedence effect. This model enables robust measurement of the localization cue values (ITD, ILD and IED) in echoic environments. The system is inspired by the auditory system of mammals. It uses a Gammatone filter bank for preprocessing and extracts the ITD and IED cues via zero crossings (the ILD calculation is straightforward). The mapping between the cue values and the different angles is learned offline, which facilitates adaptation to different head geometries. The performance of the system is demonstrated by localization results for two simultaneous speakers and for a mixture of a speaker, music, and fan noise in a normal meeting room. A real-time demonstrator of the system is presented in T. Rodemann et al. (2006).
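To make the cue extraction step more concrete, the following is a minimal sketch of how an interaural time difference (ITD) could be read off zero crossings and how an interaural level difference (ILD) could be computed as an energy ratio. It is not the authors' implementation: it assumes the input is already a single narrowband channel (as one output channel of a Gammatone filter bank would be), it omits the precedence effect model and the IED cue entirely, and all function names, the `max_itd` parameter, and the sign convention (positive ITD means the right signal lags the left) are hypothetical choices made for illustration.

```python
import math


def upward_zero_crossings(signal, fs):
    """Times (in seconds) of negative-to-positive zero crossings.

    Each crossing is located by linear interpolation between the two
    samples that bracket it. `fs` is the sampling rate in Hz.
    """
    times = []
    for n in range(1, len(signal)):
        a, b = signal[n - 1], signal[n]
        if a < 0.0 <= b:
            frac = -a / (b - a)  # fractional position of the crossing
            times.append((n - 1 + frac) / fs)
    return times


def itd_from_zero_crossings(left, right, fs, max_itd=1e-3):
    """Illustrative ITD estimate for one narrowband channel.

    Pairs every left-ear crossing with the nearest right-ear crossing,
    discards pairs further apart than `max_itd` (an assumed plausibility
    bound), and returns the median difference in seconds.
    Returns None if no valid pair is found.
    """
    t_left = upward_zero_crossings(left, fs)
    t_right = upward_zero_crossings(right, fs)
    if not t_left or not t_right:
        return None
    diffs = []
    for t in t_left:
        nearest = min(t_right, key=lambda u: abs(u - t))
        d = nearest - t
        if abs(d) <= max_itd:
            diffs.append(d)
    if not diffs:
        return None
    diffs.sort()
    return diffs[len(diffs) // 2]


def ild_db(left, right):
    """ILD as the left/right energy ratio in dB (positive: left is louder)."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    return 10.0 * math.log10(e_left / e_right)


# Usage: a 500 Hz tone that reaches the right ear 0.25 ms later
# and at half the amplitude.
fs = 16000
delay = 0.00025
left = [math.sin(2 * math.pi * 500 * n / fs) for n in range(1600)]
right = [0.5 * math.sin(2 * math.pi * 500 * (n / fs - delay)) for n in range(1600)]
itd = itd_from_zero_crossings(left, right, fs)
ild = ild_db(left, right)
```

In this sketch the median is used only to make the estimate less sensitive to occasional mismatched crossing pairs; a full system would additionally need the echo handling and the learned cue-to-angle mapping described in the abstract.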

[1] Kazuhiro Nakadai, et al. Sound source tracking with directivity pattern estimation using a 64 ch microphone array, 2005, 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[2] Parham Aarabi, et al. Enhanced sound localization, 2004, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[3] Werner Hemmert, et al. Automatic speech recognition with an adaptation model motivated by auditory processing, 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[4] Erik Berglund, et al. Sound source localisation through active audition, 2005, 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[5] T. Zahn. Neural architecture for echo suppression during sound source localization based on spiking neural cell models, 2004.

[6] B. Moore. An introduction to the psychology of hearing, 3rd ed., 1989.

[7] Rhee Man Kil, et al. Sound segregation based on binaural zero-crossings, 2005, Interspeech.

[8] Ea-Ee Jan, et al. Sound source localization in reverberant environments using an outlier elimination algorithm, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[9] Malcolm Slaney, et al. An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank, 1997.

[10] Stefan Wermter, et al. Auditory robotic tracking of sound sources using hybrid cross-correlation and recurrent networks, 2005, 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[11] Hiroshi G. Okuno, et al. Active audition for humanoid robots that can listen to three simultaneous talkers, 2003.

[12] Martin Heckmann, et al. Real-time Sound Localization With a Binaural Head-system Using a Biologically-inspired Cue-triple Mapping, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[13] Kristian Kroschel, et al. Reliability criteria evaluation for TDOA estimates in a variety of real environments, 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.

[14] Volker Willert, et al. A Probabilistic Model for Binaural Sound Localization, 2006, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[15] C. Kaernbach, et al. Psychophysical evidence against the autocorrelation theory of auditory temporal processing, 1998, The Journal of the Acoustical Society of America.

[16] Rhee Man Kil, et al. Auditory processing of speech signals for robust speech recognition in real-world noisy environments, 1999, IEEE Trans. Speech Audio Process.

[17] E. Owens, et al. An Introduction to the Psychology of Hearing, 1997.