Paper Information - A Novel Quasi-Spherical Nested Microphone Array and Multiresolution Modified SRP by GammaTone Filterbank for Multiple Speakers Localization

A Novel Quasi-Spherical Nested Microphone Array and Multiresolution Modified SRP by GammaTone Filterbank for Multiple Speakers Localization

Multiple sound source localization is one of the most important applications in speech processing. The main challenge for localization and tracking algorithms is maintaining accuracy in noisy and reverberant environments. This paper proposes a Quasi-Spherical Nested Microphone Array (QS-NMA) that eliminates spatial aliasing and is suitable for 3D sound source localization. The QS-NMA microphone signals are divided into subbands by a GammaTone filter bank according to the spectral components of speech. Subband processing is used because of the W-Disjoint Orthogonality (W-DO) of speech signals, especially at low frequencies. A modified steered response power (SRP) method is then applied to the subband signals using specific microphones of the QS-NMA. The modified SRP adaptively combines the maximum likelihood (ML) and phase transform (PHAT) weighting functions, and its peak positions are extracted according to the number of speakers. This process is repeated for all subbands, and the final histogram is obtained by combining the histograms of the individual subbands. The 3D positions of all speakers are estimated by selecting peaks of the final histogram according to the number of speakers. The proposed system is evaluated under different noisy and reverberant conditions, and its superiority over previous works is demonstrated. By using the QS-NMA, the system localizes speakers in different directions with the same probability for speaker positions in indoor conditions.
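For orientation, the basic SRP-PHAT idea that the abstract builds on can be sketched as follows. This is a minimal illustration of conventional SRP-PHAT, not the authors' modified SRP: it omits the GammaTone subbands, the adaptive ML/PHAT combination, the per-subband histograms, and the QS-NMA geometry. The four-microphone layout, the sampling rate, the candidate grid, and the names `gcc_phat`, `srp_phat`, and `simulate_mics` are all assumptions made for this example, and the simulation is free-field (no noise or reverberation).

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def gcc_phat(x1, x2, n_fft):
    """GCC-PHAT cross-correlation of two signals, indexed by circular lag in samples."""
    X1 = np.fft.rfft(x1, n_fft)
    X2 = np.fft.rfft(x2, n_fft)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep only the phase
    return np.fft.irfft(cross, n_fft)


def srp_phat(signals, mic_pos, candidates, fs):
    """Sum GCC-PHAT values at the lags each candidate position would produce."""
    n_mics, n_samples = signals.shape
    n_fft = 2 * n_samples  # zero-pad so negative lags do not alias onto positive ones
    power = np.zeros(len(candidates))
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cc = gcc_phat(signals[i], signals[j], n_fft)
            # Expected delay of mic i relative to mic j for every candidate, in samples.
            d_i = np.linalg.norm(candidates - mic_pos[i], axis=1)
            d_j = np.linalg.norm(candidates - mic_pos[j], axis=1)
            lags = np.rint((d_i - d_j) / C * fs).astype(int)
            power += cc[lags % n_fft]  # negative lags wrap to the end of cc
    return power


def simulate_mics(source, mic_pos, fs, n_samples, seed=0):
    """Toy free-field simulation: white noise with a fractional delay per microphone."""
    rng = np.random.default_rng(seed)
    pad = 512  # extra samples so the circular delay does not wrap into the kept slice
    n_long = n_samples + 2 * pad
    spectrum = np.fft.rfft(rng.standard_normal(n_long))
    freqs = np.fft.rfftfreq(n_long, d=1.0 / fs)
    dists = np.linalg.norm(mic_pos - source, axis=1)
    delays = (dists - dists.min()) / C  # relative propagation delays in seconds
    out = []
    for tau in delays:
        delayed = np.fft.irfft(spectrum * np.exp(-2j * np.pi * freqs * tau), n_long)
        out.append(delayed[pad:pad + n_samples])
    return np.array(out)


fs = 16000
# Hypothetical small array: four microphones, 0.3 m apart along each axis.
mic_pos = np.array([[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [0.0, 0.3, 0.0], [0.0, 0.0, 0.3]])
source = np.array([1.0, 2.0, 1.0])

# Candidate positions: a coarse horizontal grid at a height of 1 m.
xs = np.arange(0.5, 3.01, 0.5)
gx, gy = np.meshgrid(xs, xs, indexing="ij")
candidates = np.column_stack([gx.ravel(), gy.ravel(), np.full(gx.size, 1.0)])

signals = simulate_mics(source, mic_pos, fs, n_samples=4096)
power = srp_phat(signals, mic_pos, candidates, fs)
estimate = candidates[np.argmax(power)]
print("estimated position:", estimate)
```

For several speakers, one would take the strongest few peaks of `power` rather than a single maximum. A small array like this one resolves direction much better than range, which is why the sketch restricts the candidates to a single plane.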

Ali Dehghan Firoozabadi | Pablo Adasme | Pablo Irarrazaval | Hugo Durney | Miguel Sanhueza-Olave
