Sound source localization using sparse coding and SOM

Many sound source localization systems have been developed to detect the direction of a sound source. Most rely on the time delay of arrival (TDOA) or the interaural time difference (ITD), where the ITD is the difference in the arrival time of a sound at the two ears. The ITD, however, varies considerably with the frequency components of the sound, even when the source stays in the same place. In this paper, we propose a binaural sound localization system that uses a sparse coding based ITD (S-ITD) and a self-organizing map (SOM). Sparse coding decomposes a given sound into three components: time, frequency, and magnitude. We then estimate the azimuth angle with the SOM. The localization system is installed on our robot, which has two ears, a head, and a body; we use a PeopleBot as the robot's body.
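To make the ITD definition concrete, the following minimal sketch estimates the ITD between two ear signals as the lag that maximizes their cross-correlation. This is the conventional broadband estimate, not the sparse coding based S-ITD proposed here, and it does not include the SOM stage. The function name `estimate_itd`, the 16 kHz sample rate, the 5-sample delay, and the white-noise test signal are all assumptions chosen for illustration.

```python
import random


def estimate_itd(left, right, sample_rate, max_lag):
    """Return the ITD in seconds as the lag that maximizes the cross-correlation.

    A positive value means the sound reached the left ear first,
    i.e. the right channel is a delayed copy of the left channel.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        score = 0.0
        for i in range(max(0, -lag), min(n, n - lag)):
            score += left[i] * right[i + lag]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


# Synthetic example (assumed values): white noise that reaches the
# right ear 5 samples later than the left ear.
random.seed(0)
sample_rate = 16000
delay = 5
source = [random.gauss(0.0, 1.0) for _ in range(400)]
left = source
right = [0.0] * delay + source[:-delay]

itd = estimate_itd(left, right, sample_rate, max_lag=16)
print(f"estimated ITD: {itd * 1e6:.1f} microseconds")
```

A single broadband lag of this kind is exactly the quantity that shifts with the frequency content of the sound, which is the motivation for decomposing the signal into time, frequency, and magnitude components before computing the ITD.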
