Acoustic Localization Using Spatial Probability in Noisy and Reverberant Environments

Realistic acoustic sound source localization scenarios often involve not only multiple simultaneous sound sources, but also reverberation and noise. We propose a novel multi-source localization method based on the spatial sound presence probability (SSPP). The SSPP can be computed using prior knowledge of the anechoic relative transfer functions (RTFs), which incorporate both magnitude and phase information; this makes the approach general for any device and geometry. From the SSPP we obtain not only direction estimates for multiple simultaneous sound sources, but also their spatial presence probabilities. The SSPP can be used for a probabilistic update of the estimated directions, and can further be used to determine the dominant sound source. We demonstrate the robustness of our method in challenging non-stationary scenarios for single- and multi-speaker localization in noisy and reverberant conditions. The proposed method still localizes a sound source at a distance of 8 m with an average error below 7°.
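To illustrate the general idea of scoring candidate directions against a dictionary of anechoic RTFs, the following is a minimal sketch and not the SSPP estimator of the paper. It assumes a far-field, free-field model, a hypothetical four-microphone circular array, a cosine-similarity match averaged over frequency, and a softmax with an arbitrary `sharpness` parameter to turn similarities into normalized scores. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)


def anechoic_rtfs(mic_xy, angles_deg, freqs_hz):
    """Far-field, free-field RTFs relative to microphone 0.

    mic_xy: (M, 2) microphone positions in metres.
    Returns a complex array of shape (n_angles, n_freqs, M).
    """
    ang = np.deg2rad(angles_deg)
    directions = np.stack([np.cos(ang), np.sin(ang)], axis=1)  # (A, 2)
    delays = -(directions @ mic_xy.T) / C                      # (A, M) seconds
    delays = delays - delays[:, :1]                            # reference: mic 0
    phase = -2j * np.pi * freqs_hz[None, :, None] * delays[:, None, :]
    return np.exp(phase)


def direction_scores(observed_rtf, dictionary, sharpness=20.0):
    """Score each candidate direction for one observed RTF.

    observed_rtf: (F, M) complex; dictionary: (A, F, M) complex.
    Returns non-negative scores of shape (A,) that sum to one.
    """
    inner = np.abs(np.sum(np.conj(dictionary) * observed_rtf[None], axis=2))
    norms = (np.linalg.norm(dictionary, axis=2)
             * np.linalg.norm(observed_rtf, axis=1)[None])
    similarity = (inner / norms).mean(axis=1)      # cosine similarity in [0, 1]
    logits = sharpness * (similarity - similarity.max())
    weights = np.exp(logits)                       # softmax, an arbitrary choice
    return weights / weights.sum()


# Hypothetical setup: 4 microphones on a circle of radius 5 cm.
mics = 0.05 * np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
angles = np.arange(0, 360, 5)
freqs = np.linspace(500.0, 4000.0, 36)
rtf_dict = anechoic_rtfs(mics, angles, freqs)

# Simulated observation: the 60 degree entry plus a small complex perturbation.
rng = np.random.default_rng(0)
noise = 0.01 * (rng.standard_normal((36, 4)) + 1j * rng.standard_normal((36, 4)))
scores = direction_scores(rtf_dict[12] + noise, rtf_dict)
print("estimated direction (deg):", angles[int(np.argmax(scores))])
```

The scores produced here are only a normalized similarity measure; a calibrated presence probability, as described in the abstract, would additionally need a model of noise and reverberation.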
