Robust multiple speech source localization using time delay histogram

Spatial aliasing and limited spatial resolution are two issues faced by most multiple speech source localization methods. The histogram of time delays is a simple but effective way to deal with both issues on linear arrays, but few methods can apply the time delay histogram to direction-of-arrival (DOA) estimation using a planar array. This paper proposes a novel method to estimate the DOAs of multiple speech sources based on time delay histograms across all microphones of a planar array. The pairwise time delays of the different sources are first obtained from each time delay histogram, and the time delays are then associated with the individual speech sources. Finally, the DOA of each source is estimated by regression over its associated time delays. We conducted experiments in both simulated and real environments to evaluate the proposed method using an eight-element circular array. The experimental results confirmed not only its high computational efficiency but also its superiority in spatial resolution and its resistance to spatial aliasing.
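The final step, estimating a DOA by regression over the time delays associated with one source, can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: a far-field plane-wave model, ordinary least squares as the regression, a speed of sound of 343 m/s, an array radius of 0.1 m, and the sign convention that the delay for pair (i, j) is the arrival time at microphone i minus that at microphone j. All function names are hypothetical and the delays are simulated without noise; this is not the paper's implementation.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s (assumed value)


def circular_array(num_mics=8, radius=0.1):
    """Return (num_mics, 2) microphone positions on a circle (radius is assumed)."""
    angles = 2.0 * np.pi * np.arange(num_mics) / num_mics
    return np.column_stack((radius * np.cos(angles), radius * np.sin(angles)))


def pairwise_delays(mics, azimuth, elevation, c=SPEED_OF_SOUND):
    """Simulate noise-free far-field delays t_i - t_j for every microphone pair."""
    # Projection of the unit vector pointing at the source onto the array plane.
    u_xy = np.cos(elevation) * np.array([np.cos(azimuth), np.sin(azimuth)])
    pairs = [(i, j) for i in range(len(mics)) for j in range(i + 1, len(mics))]
    delays = np.array([-(mics[i] - mics[j]) @ u_xy / c for i, j in pairs])
    return pairs, delays


def estimate_doa(mics, pairs, delays, c=SPEED_OF_SOUND):
    """Least-squares fit of the in-plane direction vector to the pairwise delays."""
    # Each row maps the in-plane direction vector to one pairwise delay.
    design = np.array([-(mics[i] - mics[j]) / c for i, j in pairs])
    u_xy, *_ = np.linalg.lstsq(design, delays, rcond=None)
    azimuth = np.arctan2(u_xy[1], u_xy[0])
    # A planar array only observes cos(elevation); clip to keep arccos defined.
    elevation = np.arccos(min(np.linalg.norm(u_xy), 1.0))
    return azimuth, elevation


mics = circular_array()
pairs, delays = pairwise_delays(mics, np.radians(40.0), np.radians(30.0))
azimuth, elevation = estimate_doa(mics, pairs, delays)
print(np.degrees(azimuth), np.degrees(elevation))
```

Because the array is planar, the fit recovers only the in-plane projection of the direction vector, so the elevation is determined up to its sign relative to the array plane. With measured delays, the same fit would be run once per source on the delays assigned to that source.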
