Robotic Binaural Localization and Separation of Multiple Simultaneous Sound Sources

Intelligent localization and separation of multiple concurrent sound sources using only two microphones, and without a visual channel, is a challenging problem that has so far been solved with only limited success. Most published approaches rely on complex microphone arrays to localize sound in three dimensions. Psychophysical listening tests, by contrast, show that humans can locate up to six sound sources simultaneously. This localization occurs not only in azimuth but also in elevation and, to a certain extent, in distance. Humans can also identify these sources and separate and process the information they carry, including tone, pitch, and semantics, and they do so with only two ears rather than a sensor array. In this study, we address robotic sound source localization and separation of more than two dynamic sources using only two microphones inserted into the ear canals of the artificial ears of a humanoid head. By exploiting specific properties of the sound signals and combining self-splitting competitive learning with Bayesian fusion, we achieve promising results relative to state-of-the-art techniques in terms of localization accuracy and separation efficiency.
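The abstract names Bayesian fusion without detailing how it is applied. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below fuses two noisy azimuth estimates on a discrete grid by multiplying a prior with per-cue likelihoods and normalizing. The Gaussian likelihood model, the 5-degree grid, the assumption that the cues are conditionally independent, and all function and variable names are assumptions introduced here.

```python
import math


def gaussian_likelihood(grid, observed, sigma):
    """Unnormalized Gaussian likelihood of each candidate angle (assumed cue model)."""
    return [math.exp(-0.5 * ((a - observed) / sigma) ** 2) for a in grid]


def bayesian_fuse(prior, likelihoods):
    """Posterior over the grid: prior times the product of likelihoods, normalized.

    Assumes the cues are conditionally independent given the true angle.
    """
    posterior = list(prior)
    for lik in likelihoods:
        posterior = [p * l for p, l in zip(posterior, lik)]
    total = sum(posterior)
    if total == 0.0:
        raise ValueError("cues are mutually inconsistent: posterior mass is zero")
    return [p / total for p in posterior]


# Hypothetical setup: candidate azimuths from -90 to 90 degrees in 5-degree steps.
azimuths = list(range(-90, 91, 5))
prior = [1.0 / len(azimuths)] * len(azimuths)  # uniform prior

# Two hypothetical cues that each give a noisy azimuth estimate in degrees.
cue_a = gaussian_likelihood(azimuths, observed=20.0, sigma=15.0)
cue_b = gaussian_likelihood(azimuths, observed=30.0, sigma=15.0)

posterior = bayesian_fuse(prior, [cue_a, cue_b])
best = azimuths[max(range(len(azimuths)), key=posterior.__getitem__)]
print(best)
```

With two equally reliable cues, the fused estimate falls midway between them; giving one cue a smaller `sigma` would pull the estimate toward that cue.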
