Epipolar geometry based sound localization and extraction for humanoid audition

Sound localization for a robot or an embedded system is usually based on the inter-aural phase difference (IPD) and the inter-aural intensity difference (IID). These values are calculated using the head-related transfer function (HRTF). However, the HRTF depends on the shape of the head and changes with the environment. Therefore, sound localization without the HRTF is needed for real-world applications. In this paper, we present a new sound localization method based on auditory epipolar geometry with motion control. Auditory epipolar geometry extends the epipolar geometry of stereo vision to audition, so that the auditory and visual epipolar geometries can share the sound source direction. The key idea is to exploit additional inputs obtained through motor control to compensate for the degradation of the IPD and IID caused by room reverberation and the robot's body. The proposed system can simultaneously localize and extract two sound sources in a real-world room.
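To illustrate the core relation the abstract relies on, the sketch below estimates a source direction from the IPD of a two-microphone pair under a far-field assumption, where IPD(f) = 2*pi*f * baseline * cos(theta) / c. This is a minimal illustration, not the paper's method: the microphone spacing `baseline`, the speed of sound `c`, the frequency band, and the median aggregation are all assumptions introduced here.

```python
import numpy as np

def estimate_direction(left, right, fs, baseline=0.18, c=343.0):
    """Estimate a source direction (degrees) from the inter-aural phase difference.

    Far-field assumption: IPD(f) = 2*pi*f * baseline * cos(theta) / c,
    so cos(theta) = c * IPD(f) / (2*pi*f * baseline).
    `baseline` (microphone spacing in metres) and `c` (speed of sound in m/s)
    are illustrative values, not taken from the paper.
    """
    n = len(left)
    window = np.hanning(n)
    L = np.fft.rfft(left * window)
    R = np.fft.rfft(right * window)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # IPD per frequency bin from the phase of the cross-spectrum.
    ipd = np.angle(L * np.conj(R))

    # Keep bins where the phase difference is unambiguous (|IPD| < pi),
    # i.e. f < c / (2 * baseline), and skip very low, noisy bins.
    valid = (freqs > 200.0) & (freqs < c / (2.0 * baseline))
    cos_theta = c * ipd[valid] / (2.0 * np.pi * freqs[valid] * baseline)
    cos_theta = np.clip(cos_theta, -1.0, 1.0)

    # Aggregate the per-bin estimates into a single direction.
    return np.degrees(np.arccos(np.median(cos_theta)))
```

A quick way to exercise the sketch is to delay one channel of a synthetic tone by baseline * cos(theta) / c seconds and check that the estimate recovers theta; the paper's contribution is precisely to make such estimates robust when reverberation and the robot's body distort the measured IPD and IID.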
