Robotic binaural localization and separation of more than two concurrent sound sources

We present a new binaural sound-source separation and localization technique for the underdetermined case, in which the sound sources present outnumber the available microphones. The proposed technique has access to a generic set of head-related transfer functions (HRTFs) and processes input signals obtained from two small microphones placed inside the ear canals of a humanoid robot head equipped with artificial ears and mounted on a torso. By exploiting sparse representations of the ear input signals, the 3D positions of three concurrent sound sources are extracted by identifying the HRTFs that have filtered the sound signals. The sought HRTFs are estimated using a well-known self-splitting competitive learning technique for clustering. Simulation results demonstrate the performance of the new technique in localizing both the azimuth and elevation angles of three concurrently active sound sources. The proposed method relies purely on auditory cues and is easy to implement on robotic platforms.
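The idea of identifying the HRTF pair that filtered each source can be illustrated with a minimal sketch. It is not the method of the paper: the HRTF bank is random toy data, the sparsity is idealized (each time-frequency bin is assumed to belong to exactly one source), and a simple per-bin vote stands in for the self-splitting competitive learning step. All function names and parameters are hypothetical.

```python
import numpy as np


def make_toy_hrtf_bank(n_dirs, n_freqs, rng):
    """Random complex left/right responses standing in for a measured HRTF set."""
    shape = (n_dirs, n_freqs)
    h_left = rng.normal(size=shape) + 1j * rng.normal(size=shape)
    h_right = rng.normal(size=shape) + 1j * rng.normal(size=shape)
    return h_left, h_right


def simulate_sparse_mixture(h_left, h_right, source_dirs, n_frames, rng, noise=0.01):
    """Two-ear mixture where every time-frequency bin is owned by one source (assumption)."""
    n_freqs = h_left.shape[1]
    shape = (n_frames, n_freqs)
    owner = rng.integers(0, len(source_dirs), size=shape)  # which source owns each bin
    dirs = np.asarray(source_dirs)[owner]                  # direction index per bin
    freqs = np.arange(n_freqs)
    s = rng.normal(size=shape) + 1j * rng.normal(size=shape)  # source spectra
    x_left = h_left[dirs, freqs] * s
    x_right = h_right[dirs, freqs] * s
    x_left = x_left + noise * (rng.normal(size=shape) + 1j * rng.normal(size=shape))
    x_right = x_right + noise * (rng.normal(size=shape) + 1j * rng.normal(size=shape))
    return x_left, x_right


def localize_sources(x_left, x_right, h_left, h_right, n_sources):
    """Vote, bin by bin, for the bank direction that best explains the ear signals."""
    # If direction d filtered a bin, x_left * h_right[d] - x_right * h_left[d] is ~0.
    resid = np.abs(x_left[None] * h_right[:, None, :] - x_right[None] * h_left[:, None, :])
    norm = np.sqrt(np.abs(h_left) ** 2 + np.abs(h_right) ** 2)[:, None, :]
    best_dir = np.argmin(resid / norm, axis=0)  # shape (n_frames, n_freqs)
    votes = np.bincount(best_dir.ravel(), minlength=h_left.shape[0])
    # Simple voting replaces the clustering step used in the paper.
    return sorted(np.argsort(votes)[-n_sources:].tolist())


rng = np.random.default_rng(0)
h_left, h_right = make_toy_hrtf_bank(n_dirs=24, n_freqs=64, rng=rng)
true_dirs = [3, 11, 19]  # three sources, two microphones
x_left, x_right = simulate_sparse_mixture(h_left, h_right, true_dirs, n_frames=50, rng=rng)
print(localize_sources(x_left, x_right, h_left, h_right, n_sources=3))
```

In this toy setting each direction index would map to one azimuth and elevation pair in the HRTF set. With real signals the sparsity assumption holds only approximately, which is one reason a clustering step is more appropriate than a hard per-bin vote.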
