A spherical cross-channel algorithm for binaural sound localization

This paper proposes a sound localization algorithm inspired by the cross-channel algorithm first studied by MacDonald et al. in 2008. The original algorithm assumes that the Head-Related Transfer Functions (HRTFs) of the robotic head under study are precisely known, which is rarely the case in practice. Since most heads are approximately spherical, this assumption is relaxed by using HRTFs computed from a simple spherical head model with the same radius as the robot head. To evaluate the proposed approach in realistic noisy conditions, an isotropic noise field is also computed, and a precise definition of the Signal-to-Noise Ratio (SNR) in a binaural context is outlined. These theoretical developments are then assessed with simulated and experimental signals. Despite its simplicity, the proposed approach appears to be robust to noise and to provide reliable sound localization estimates in the frontal azimuthal plane.
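The cross-channel idea can be illustrated with a minimal sketch. For each candidate azimuth, the left signal is filtered with the candidate right-ear response and the right signal with the candidate left-ear response; when the candidate matches the true direction, both filtered signals are the source convolved with the same pair of responses, so their correlation peaks. Everything specific below is an assumption made for illustration and is not taken from the paper: the names `toy_hrir_pair` and `cross_channel_localize`, the sampling rate, the head radius value, and above all the toy impulse responses, which use only a Woodworth-style delay and a crude gain in place of a true spherical head model. The candidate grid is kept at 10 degrees so that each direction maps to a distinct integer delay.

```python
import numpy as np

FS = 44100        # sampling rate in Hz (assumed)
RADIUS = 0.0875   # head radius in metres (assumed value)
C = 343.0         # speed of sound in m/s


def toy_hrir_pair(azimuth_deg, length=64):
    """Return (left, right) toy impulse responses for one azimuth.

    Delay-and-gain stand-in only, NOT the spherical head model of the paper.
    Positive azimuth means the source is on the right-hand side.
    """
    theta = np.deg2rad(azimuth_deg)
    # Woodworth-style interaural time difference, rounded to whole samples.
    itd = RADIUS / C * (abs(theta) + np.sin(abs(theta)))
    delay = int(round(itd * FS))
    near = np.zeros(length)
    far = np.zeros(length)
    near[8] = 1.0
    far[8 + delay] = 1.0 - 0.3 * abs(np.sin(theta))  # crude head shadow
    return (far, near) if azimuth_deg > 0 else (near, far)


def cross_channel_localize(x_left, x_right, candidates):
    """Pick the candidate azimuth maximising the cross-channel correlation."""
    scores = []
    for az in candidates:
        h_left, h_right = toy_hrir_pair(az)
        a = np.convolve(x_left, h_right)   # left signal through right response
        b = np.convolve(x_right, h_left)   # right signal through left response
        denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-12
        scores.append(np.dot(a, b) / denom)
    scores = np.array(scores)
    return int(candidates[int(np.argmax(scores))]), scores


# Simulated broadband source at 40 degrees with independent sensor noise.
rng = np.random.default_rng(0)
source = rng.standard_normal(4096)
h_l, h_r = toy_hrir_pair(40)
x_l = np.convolve(source, h_l)
x_r = np.convolve(source, h_r)
x_l = x_l + 0.1 * rng.standard_normal(x_l.size)
x_r = x_r + 0.1 * rng.standard_normal(x_r.size)

candidates = np.arange(-90, 91, 10)
estimate, scores = cross_channel_localize(x_l, x_r, candidates)
print("estimated azimuth (degrees):", estimate)
```

In this sketch the candidate responses and the simulated ones come from the same toy model, so the match is exact by construction. The setting described in the abstract is harder: the candidate responses come from a spherical model while the signals are shaped by the real robot head.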

[1]  Hiroaki Kitano, et al. Active Audition for Humanoid, 2000, AAAI/IAAI.

[2]  H.G. Okuno, et al. Computational Auditory Scene Analysis and its Application to Robot Audition, 2004, 2008 Hands-Free Speech Communication and Microphone Arrays.

[3]  Jean-Luc Zarader, et al. From monaural to binaural speaker recognition for humanoid robots, 2010, 2010 10th IEEE-RAS International Conference on Humanoid Robots.

[4]  Martin Rothbucher, et al. Low latency localization of multiple sound sources in reverberant environments, 2011, The Journal of the Acoustical Society of America.

[5]  José Santos-Victor, et al. Sound Localization for Humanoid Robots - Building Audio-Motor Maps based on the HRTF, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[6]  J. MacDonald. A localization algorithm based on head-related transfer functions, 2008, The Journal of the Acoustical Society of America.

[7]  Tetsunori Kobayashi, et al. Multi-person conversation via multi-modal interface - a robot who communicate with multi-user -, 1999, EUROSPEECH.

[8]  Tetsuya Ogata, et al. Improvement of speaker localization by considering multipath interference of sound wave for binaural robot audition, 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[9]  R. Duda, et al. Range dependence of the response of a spherical head model, 1998.

[10]  Hiroaki Kitano, et al. Applying scattering theory to robot audition system: robust sound source localization and extraction, 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).

[11]  Fakheredine Keyrouz. Humanoid hearing: A novel three-dimensional approach, 2011, 2011 IEEE International Symposium on Robotic and Sensors Environments (ROSE).

[12]  I. Cohen, et al. Generating nonstationary multisensor signals under a spatial coherence constraint, 2008, The Journal of the Acoustical Society of America.

[13]  A. A. Handzel, et al. Biomimetic sound-source localization, 2002.

[14]  Dong-Soo Kwon, et al. A robust online touch pattern recognition for dynamic human-robot interaction, 2010, IEEE Transactions on Consumer Electronics.

[15]  Klaus Diepold, et al. A New Method for Binaural 3-D Localization Based on Hrtfs, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[16]  Iwaki Toshima, et al. Possibility of simplifying head shape with the effect of head movement for an acoustical telepresence robot: TeleHead, 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[17]  R. Nicol. Représentation et perception des espaces auditifs virtuels, 2010.

[18]  Jean Rouat, et al. Enhanced robot audition based on microphone array source separation with post-filter, 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566).

[19]  Y. J. Tejwani, et al. Robot vision, 1989, IEEE International Symposium on Circuits and Systems.

[20]  Hiroaki Kitano, et al. Auditory fovea based speech separation and its application to dialog system, 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.

[21]  Peter Vary, et al. A Semi-Analytical Model for the Binaural Coherence of Noise Fields, 2011, IEEE Signal Processing Letters.

[22]  Kazuhiro Nakadai, et al. Real-time sound source orientation estimation using a 96 channel microphone array, 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[23]  Patrick Danès, et al. Information-theoretic detection of broadband sources in a coherent beamspace MUSIC scheme, 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[24]  Juan Liang, et al. Robust and low complexity localization algorithm based on head-related impulse responses and interaural time difference, 2013, The Journal of the Acoustical Society of America.

[25]  C. Avendano, et al. The CIPIC HRTF database, 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[26]  Berthold K. P. Horn. Robot vision, 1986, MIT electrical engineering and computer science series.