A novel biologically inspired neural network solution for robotic 3D sound source sensing

This paper presents a novel real-time robotic binaural sound localization method based on hierarchical fuzzy artificial neural networks and a generic set of head-related transfer functions. The robot is a humanoid equipped with the KEMAR artificial head and torso. Inside the ear canals, two small microphones play the role of the eardrums in collecting the impinging sound waves. The neural networks are trained using synthesized sound sources placed every 5° from 0° to 255° in azimuth, and every 5° from −45° to 80° in elevation. To improve generalization, the training data were corrupted with noise. Thanks to fuzzy logic, the method is able to interpolate at its output, locating with high accuracy sound sources at positions that were not used for training, even in the presence of strong distortion. To achieve high localization accuracy, two different binaural cues are combined, namely the interaural intensity differences and the interaural time differences. As opposed to microphone-array methods, the presented technique uses only two microphones to localize sound sources in a real-time 3D environment.
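The abstract names the two binaural cues but does not describe how they are extracted. As a rough illustration only, the sketch below estimates an interaural time difference from the peak of the cross-correlation between the two ear signals, and an interaural intensity difference from their energy ratio in decibels. The function name `binaural_cues`, the cross-correlation estimator, and the sign conventions are assumptions made for this example, not the paper's procedure.

```python
import numpy as np

def binaural_cues(left, right, fs):
    """Return (itd_seconds, iid_db) for one binaural frame.

    Hypothetical helper, not taken from the paper.
    ITD: lag of the cross-correlation peak; a positive value means the
    right channel lags the left one.
    IID: energy ratio left/right in dB; a positive value means the left
    channel is louder.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)

    # Full cross-correlation: xcorr[k] = sum_n right[n + k] * left[n].
    xcorr = np.correlate(right, left, mode="full")
    lags = np.arange(-(len(left) - 1), len(right))
    itd = lags[np.argmax(xcorr)] / fs

    # A small constant guards against log(0) on silent frames.
    eps = 1e-12
    iid = 10.0 * np.log10((np.sum(left ** 2) + eps) / (np.sum(right ** 2) + eps))
    return itd, iid

# Usage: a noise burst that reaches the right ear 5 samples later
# and at half the amplitude.
fs = 44100
rng = np.random.default_rng(0)
burst = rng.standard_normal(1024)
left = np.concatenate([burst, np.zeros(5)])
right = 0.5 * np.concatenate([np.zeros(5), burst])
itd, iid = binaural_cues(left, right, fs)
```

In a learning-based localizer, cue values of this kind, computed per frame or per frequency band, would typically form part of the input features; how the paper's networks consume them is not stated in the abstract.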
