A new region search method based on DOA estimation for speech source localization by SRP-PHAT method

The Steered Response Power-PHAse Transform (SRP-PHAT) method has been proposed and investigated for sound source localization. Grid search methods can find the global maximum of the SRP, but they are too computationally expensive for real-time applications. In this paper, we propose an SRP-based localization method that works in cascade with a direction-of-arrival (DOA) estimation module: first, the direction of the speaker is estimated by a DOA estimation method; next, the search region is bounded to a fragment of space around the estimated direction; finally, SRP-PHAT computations and volume contraction methods (such as SRC and CFRC) are applied to this bounded region, which greatly reduces the computational cost. Using data collected from different speaker scenarios, we demonstrate the accuracy and speed gained by the proposed method.
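
The region-bounding step can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions: the geometry is reduced to a 2-D plane with the array at the origin, the DOA estimate is a single azimuth angle, and `toy_srp` is a hypothetical placeholder that stands in for the SRP-PHAT functional (it simply peaks at a known source position). The function names, grid resolutions, and sector half-width are illustrative choices. A plain grid search is used here for brevity; SRC or CFRC would replace it within the same bounded sector.

```python
import math

def sector_grid(doa_deg, half_width_deg, r_min, r_max, n_az, n_r):
    """Candidate (x, y) points in a sector centred on the estimated DOA.

    Azimuth spans doa_deg +/- half_width_deg, range spans r_min..r_max.
    """
    points = []
    for i in range(n_az):
        az_deg = doa_deg - half_width_deg + 2.0 * half_width_deg * i / (n_az - 1)
        az = math.radians(az_deg)
        for j in range(n_r):
            r = r_min + (r_max - r_min) * j / (n_r - 1)
            points.append((r * math.cos(az), r * math.sin(az)))
    return points

def grid_search(points, srp):
    """Return the candidate point with the highest response value."""
    return max(points, key=srp)

def toy_srp(p, source=(2.0, 2.0)):
    # Hypothetical placeholder, NOT SRP-PHAT: larger when p is nearer the source.
    return -math.hypot(p[0] - source[0], p[1] - source[1])

# Unbounded search: all azimuths in 1-degree steps, 0.1 m range steps.
full = sector_grid(0.0, 180.0, 0.5, 5.0, 361, 46)

# Bounded search: +/-10 degrees around an assumed DOA estimate of 45 degrees,
# with the same angular and range resolution.
bounded = sector_grid(45.0, 10.0, 0.5, 5.0, 21, 46)

best = grid_search(bounded, toy_srp)
print(len(full), len(bounded), best)
```

With these illustrative settings the bounded sector contains 966 candidate points against 16606 for the full sweep, so the functional is evaluated roughly 17 times less often at the same resolution. The saving in a real system depends on the accuracy of the DOA estimate and on the sector width chosen around it.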
