Improving audio source localization by learning the precedence effect

Audio source localization in reverberant environments is difficult for automated microphone array systems. Certain features observable in the audio signal, such as sudden increases in audio energy, indicate time-frequency regions that are particularly useful for localization, but previous approaches have not exploited these cues systematically. Using ridge regression, we learn a mapping from reverberated signal spectrograms to localization precision. The resulting mappings exhibit behavior consistent with the well-known precedence effect from psychoacoustic studies. Using the learned mappings, we demonstrate improved localization performance.
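
To make the regression step concrete, the following is a minimal sketch of closed-form ridge regression applied to a short window of log-spectrogram frames. It is an illustration under stated assumptions, not the implementation used in this work: the helper names (`ridge_fit`, `context_features`), the window length `n_past`, the regularization weight `lam`, and the synthetic data and weights are all hypothetical, and the target here is a made-up stand-in for localization precision.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

def context_features(log_spec, n_past):
    """For one frequency band, stack the current frame and the n_past
    preceding frames into a feature row, then append a bias column."""
    T = log_spec.shape[0]
    rows = [log_spec[t - n_past:t + 1] for t in range(n_past, T)]
    X = np.asarray(rows)
    return np.hstack([X, np.ones((X.shape[0], 1))])

# Synthetic stand-in for a single-band log-spectrogram (assumption).
rng = np.random.default_rng(0)
log_spec = rng.normal(size=200)
n_past = 4
X = context_features(log_spec, n_past)

# Hypothetical ground-truth weights: negative on past frames, positive on
# the current frame, so that an energy onset raises the target value.
# The last entry is the bias weight.
true_w = np.array([-0.1, -0.2, -0.3, -0.5, 1.0, 0.0])
y = X @ true_w + 0.01 * rng.normal(size=X.shape[0])

# Fit the mapping from spectrogram context to the synthetic target.
w = ridge_fit(X, y, lam=1e-3)
print(np.round(w, 2))
```

In this toy setup the recovered weights favor frames where energy rises relative to the recent past, which is the kind of onset emphasis the precedence effect describes. The weights are planted in the synthetic data, so this shows only the mechanics of the fit and is not evidence for the result reported here.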