Evaluating real-time audio localization algorithms for artificial audition in robotics

Although sound source localization with microphone arrays has been studied for years, providing this capability on robots is relatively new. Artificial audition systems for robots exist, but the methods they use to localize sound sources have not yet been evaluated. This paper evaluates several real-time audio localization algorithms using a medium-sized microphone array suitable for robotics applications. The techniques studied are implementations and enhancements of steered response power with phase transform (SRP-PHAT) beamformers, which are the most popular methods for time-difference-of-arrival audio localization. Two grid topologies for implementing the source direction search are also compared. Results show that a direction refinement procedure can improve localization accuracy, and that a uniform triangular element grid allows more efficient and accurate direction searches than the typical rectangular element grid.
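
The difference between the two grid topologies can be illustrated with a minimal sketch. The abstract does not say how either grid is constructed, so the code below makes two assumptions: the rectangular grid is a regular azimuth/elevation lattice, and the uniform triangular grid is obtained by recursively subdividing an icosahedron projected onto the unit sphere. All function names are hypothetical. The spacing ratio (largest over smallest nearest-neighbor angle) is only a rough indicator of how evenly the candidate directions cover the sphere, not the evaluation metric used in the paper.

```python
import math

def rectangular_grid(n_az, n_el):
    """Unit direction vectors on an azimuth/elevation lattice (assumed layout)."""
    points = []
    for i in range(n_el):
        # Elevation rows are offset by half a step so no row sits on a pole.
        el = -math.pi / 2 + math.pi * (i + 0.5) / n_el
        for j in range(n_az):
            az = 2 * math.pi * j / n_az
            points.append((math.cos(el) * math.cos(az),
                           math.cos(el) * math.sin(az),
                           math.sin(el)))
    return points

def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def _icosahedron():
    """12 vertices and 20 triangular faces of a regular icosahedron."""
    t = (1 + math.sqrt(5)) / 2
    verts = [(-1, t, 0), (1, t, 0), (-1, -t, 0), (1, -t, 0),
             (0, -1, t), (0, 1, t), (0, -1, -t), (0, 1, -t),
             (t, 0, -1), (t, 0, 1), (-t, 0, -1), (-t, 0, 1)]
    faces = [(0, 11, 5), (0, 5, 1), (0, 1, 7), (0, 7, 10), (0, 10, 11),
             (1, 5, 9), (5, 11, 4), (11, 10, 2), (10, 7, 6), (7, 1, 8),
             (3, 9, 4), (3, 4, 2), (3, 2, 6), (3, 6, 8), (3, 8, 9),
             (4, 9, 5), (2, 4, 11), (6, 2, 10), (8, 6, 7), (9, 8, 1)]
    return [_normalize(v) for v in verts], faces

def _subdivide(verts, faces):
    """Split every triangle into four, pushing new vertices onto the sphere."""
    verts = list(verts)
    cache = {}  # shared edge -> index of its midpoint vertex

    def midpoint(a, b):
        key = (min(a, b), max(a, b))
        if key not in cache:
            m = tuple((verts[a][k] + verts[b][k]) / 2 for k in range(3))
            verts.append(_normalize(m))
            cache[key] = len(verts) - 1
        return cache[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
    return verts, new_faces

def triangular_grid(levels):
    """Unit direction vectors from an icosahedron subdivided `levels` times."""
    verts, faces = _icosahedron()
    for _ in range(levels):
        verts, faces = _subdivide(verts, faces)
    return verts

def spacing_ratio(points):
    """Largest nearest-neighbor angle divided by the smallest one."""
    angles = []
    for i, p in enumerate(points):
        best = -1.0  # largest dot product = closest neighbor
        for j, q in enumerate(points):
            if i != j:
                best = max(best, sum(a * b for a, b in zip(p, q)))
        angles.append(math.acos(max(-1.0, min(1.0, best))))
    return max(angles) / min(angles)

if __name__ == "__main__":
    tri = triangular_grid(2)          # 10 * 4**2 + 2 directions
    rect = rectangular_grid(18, 9)    # same number of directions
    print(len(tri), len(rect))
    print("triangular spacing ratio:", spacing_ratio(tri))
    print("rectangular spacing ratio:", spacing_ratio(rect))
```

With the same number of candidate directions, the azimuth/elevation lattice crowds points near the poles, while the subdivided icosahedron keeps neighboring directions at nearly equal angular distances. Under these assumptions, that is one plausible reason a triangular grid could make a direction search more efficient for a given angular resolution.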
