Fast Sound Source Localization Using Two-Level Search Space Clustering

Steered response power phase transform (SRP-PHAT) is widely used for robust sound source localization (SSL). However, because SRP-PHAT searches over a large number of candidate locations, it is too slow to run in real time on large-scale microphone array systems. In this paper, we propose a robust two-level search space clustering method to speed up SRP-PHAT-based SSL. The proposed method divides the candidate source locations into groups and identifies a small number of groups that are likely to contain the maximum-power location. Restricting the search to these groups reduces the computational cost by 61.8% compared with a previously proposed method, without loss of accuracy.
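
The coarse-to-fine idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify how groups are formed or ranked, so the sketch assumes contiguous groups of neighbouring candidates, one representative per group (its middle member), and a synthetic smooth power map standing in for the SRP-PHAT response. The names `two_level_search`, `power_fn`, and `top_k` are hypothetical.

```python
import numpy as np

def two_level_search(power_fn, groups, top_k=2):
    """Score one representative per group, then search only the best groups."""
    # Level 1: evaluate a single representative (assumed: the middle member) per group.
    reps = [g[len(g) // 2] for g in groups]
    rep_power = np.array([power_fn(r) for r in reps])
    best_groups = np.argsort(rep_power)[-top_k:]

    # Level 2: exhaustive search restricted to the selected groups.
    candidates = np.concatenate([groups[i] for i in best_groups])
    cand_power = np.array([power_fn(c) for c in candidates])

    n_evals = len(reps) + len(candidates)  # total power evaluations performed
    return int(candidates[np.argmax(cand_power)]), n_evals

# Toy stand-in for an SRP-PHAT power map (assumption): a smooth peak
# over 1000 candidate locations indexed 0..999.
n_candidates = 1000
true_peak = 637
grid = np.arange(n_candidates)
power_map = np.exp(-0.5 * ((grid - true_peak) / 40.0) ** 2)

# Assumed grouping: 20 contiguous groups of 50 neighbouring candidates.
groups = np.array_split(grid, 20)

best, n_evals = two_level_search(lambda i: power_map[i], groups, top_k=2)
print(f"estimated peak index: {best}, evaluations: {n_evals} of {n_candidates}")
```

In an actual SRP-PHAT system, `power_fn` would sum the PHAT-weighted cross-correlations of all microphone pairs at the time delays implied by the candidate location. The saving comes from evaluating that sum only for the group representatives and for the members of the few selected groups. The sketch only finds the true maximum when the power map is smooth enough for the representatives to rank the groups correctly, which is the assumption any such grouping scheme relies on.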
