Time-domain generalized cross correlation phase transform sound source localization for small microphone arrays

Due to hard- and software progress applications based on sound enhancement are gaining popularity. But such applications are often still limited by hardware costs, energy and real-time constraints, thereby bounding the available complexity. One task often accompanied with (multichannel) sound enhancement is the localization of the sound source. This paper focusses on implementing an accurate Sound Source Localizer (SSL) for estimating the position of a sound source on a digital signal processor, using as less CPU resources as possible. One of the least complex algorithms for SSL is a simple correlation, implemented in the frequency-domain for efficiency, combined with a frequency bin weighing for robustness. Together called Generalized Cross Correlation (GCC). One popular weighing called GCC PHAse Transform (GCC-PHAT) will be handled. In this paper it is explained that for small microphone arrays this frequency-domain implementation is inferior to its time-domain alternative in terms of algorithmic complexity. Therefore a time-domain PHAT equivalent will be described. Both implementations are compared in terms of complexity (clock cycles needed on a Texas Instruments C5515 DSP) and obtained results, showing a complexity gain with a factor of 146, with hardly any loss in localization accuracy.