Binaural Sound Source Localization based on Sub-band SNR Estimation

Sound Source Localization (SSL) is widely applied in speech separation, recognition, and enhancement. Binaural sound source localization, which is based on the human spatial hearing mechanism, is an important research field within SSL. Recent binaural SSL research has focused on making systems robust against noise and reverberation. To improve localization performance in degraded environments, this paper proposes an algorithm that adaptively selects the ‘good’ sub-bands from which to compute the binaural localization cues. First, the sub-band Signal-to-Noise Ratio (SNR) is estimated from the autocorrelation matrix of the binaural sound signals. Then, the Inter-aural Time Difference (ITD) is computed from the adaptively selected sub-bands with high SNR. Because the ITD is calculated from sub-bands that are less affected by noise, the sound source azimuth is estimated more accurately. Simulation results show that the proposed algorithm localizes significantly more accurately than the conventional binaural SSL algorithm.
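
The abstract does not give the estimator's formulas, so the following is only a minimal Python sketch of the general idea, not the paper's method. It assumes a single source with spatially uncorrelated, equal-power noise at the two ears, so that the smaller eigenvalue of each per-bin 2x2 binaural correlation matrix approximates the noise power. It also assumes equal-width frequency bands instead of an auditory filterbank, and uses a PHAT-weighted cross-correlation restricted to the selected bands as the ITD estimator. All function and parameter names (`stft`, `estimate_itd`, `n_bands`, `keep`, `max_itd`) are hypothetical.

```python
import numpy as np


def stft(x, frame_len, hop):
    """Hann-windowed frames -> one-sided spectra, shape (n_frames, n_bins)."""
    win = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * win for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def estimate_itd(left, right, fs, n_bands=8, keep=3,
                 frame_len=512, hop=256, max_itd=1e-3):
    """Return (itd_seconds, snr_db_per_band, selected_band_indices).

    A positive ITD means the right-ear signal lags the left-ear signal.
    """
    L = stft(left, frame_len, hop)
    R = stft(right, frame_len, hop)

    # Entries of the per-bin 2x2 correlation matrix, averaged over frames.
    p_ll = np.mean(np.abs(L) ** 2, axis=0)
    p_rr = np.mean(np.abs(R) ** 2, axis=0)
    p_lr = np.mean(np.conj(L) * R, axis=0)

    # Closed-form eigenvalues of a Hermitian 2x2 matrix.
    half_trace = 0.5 * (p_ll + p_rr)
    root = np.sqrt(0.25 * (p_ll - p_rr) ** 2 + np.abs(p_lr) ** 2)
    lam_max = half_trace + root
    lam_min = np.maximum(half_trace - root, 1e-12)  # assumed noise power

    # Group bins into equal-width sub-bands and estimate one SNR per band.
    edges = np.linspace(0, L.shape[1], n_bands + 1).astype(int)
    snr_db = np.empty(n_bands)
    for b in range(n_bands):
        sl = slice(edges[b], edges[b + 1])
        signal = max(np.sum(lam_max[sl] - lam_min[sl]), 1e-12)
        snr_db[b] = 10.0 * np.log10(signal / np.sum(lam_min[sl]))

    # Keep only the `keep` sub-bands with the highest estimated SNR.
    selected = np.sort(np.argsort(snr_db)[-keep:])
    mask = np.zeros(L.shape[1])
    for b in selected:
        mask[edges[b]:edges[b + 1]] = 1.0

    # PHAT-weighted cross-correlation using the selected bands only.
    cross = mask * p_lr / np.maximum(np.abs(p_lr), 1e-12)
    cc = np.fft.irfft(cross, n=frame_len)

    # Search the physically plausible lag range; negative lags wrap around.
    max_lag = int(round(max_itd * fs))
    lags = np.arange(-max_lag, max_lag + 1)
    best_lag = lags[np.argmax(cc[lags])]
    return best_lag / fs, snr_db, selected
```

Selecting a fixed number of bands (`keep`) is one simple choice; thresholding the estimated SNR would be an equally plausible selection rule, and the abstract does not say which the authors use. Mapping the resulting ITD to an azimuth requires a head model or measured head-related transfer functions, which this sketch leaves out.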
