Real-Time Sound Source Localization on Graphics Processing Units

Abstract

Sound source localization is an important topic in microphone array signal processing, with applications such as camera steering systems, human-machine interaction, and surveillance systems. The Steered Response Power with Phase Transform (SRP-PHAT) algorithm is one of the best-known approaches to sound source localization because it performs well in noisy and reverberant environments. The algorithm evaluates the sound power captured by a microphone array at each point of a spatial grid in a given room. Localization accuracy can be improved by using a high-resolution spatial grid and a large number of microphones, but doing so greatly increases the computational cost. Graphics Processing Units (GPUs) are highly parallel programmable coprocessors that provide massive computational power when the required operations are properly parallelized. This paper analyzes the performance of a real-time sound source localization system whose processing is carried out entirely on a GPU. The proposed implementation maximizes parallelism by adapting the required computations to different GPU architectures (Tesla, Fermi and Kepler).