Evaluation of Joint Position-Pitch Estimation Algorithm for Localising Multiple Speakers in Adverse Acoustical Environments
Automatic speaker localisation, detection and tracking are important challenges in multi-channel hands-free communication systems. In particular, simultaneous localisation of different speakers is of great interest for multi-microphone noise reduction schemes. Besides position, another feature that can distinguish between speakers is the fundamental frequency (pitch) of their voices. The recently proposed Position-Pitch (PoPi) estimation algorithm combines speaker localisation based on well-known cross-correlation approaches with pitch estimation techniques. In this contribution we evaluate the robustness of a modified version of the PoPi algorithm for localising simultaneous speakers in a realistic environment that includes room reverberation and different signal-to-noise ratios (SNR). To improve robustness, we focus in particular on modifications of the frequency-domain phase transformation T{·} used by the original PoPi algorithm.

Joint position-pitch estimation

Speaker localisation methods such as [5] combine pitch and position estimation in a two-step approach: in a first stage a pitch estimation algorithm is applied to obtain the pitch f0, and in a second stage the direction of arrival (DoA) φ0 is determined. The approach used here instead estimates pitch and position automatically in one step using the so-called PoPi plane ρ(φ, f0).
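The text above refers to localisation by "well-known cross-correlation approaches" combined with a frequency-domain phase transformation, but does not spell that stage out. As a rough illustration only, the following Python sketch shows the standard generalised cross-correlation with phase transform (GCC-PHAT) for a single microphone pair. It is not the PoPi algorithm and not the modified transformation T{·} evaluated here. The function names `gcc_phat` and `delay_to_doa`, the far-field two-microphone geometry, the sign convention for the angle and the speed of sound of 343 m/s are all assumptions made for this example.

```python
import numpy as np


def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay (in seconds) of x2 relative to x1 with GCC-PHAT.

    Hypothetical helper for illustration; a positive result means x2 lags x1.
    """
    n = len(x1) + len(x2)  # zero-pad so the circular correlation does not wrap
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum, then the phase transform: keep only the phase.
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12

    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


def delay_to_doa(tau, mic_distance, c=343.0):
    """Convert a delay to a far-field direction of arrival in degrees.

    Assumes two microphones mic_distance metres apart and a plane wave;
    0 degrees is broadside to the microphone axis.
    """
    s = np.clip(c * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))
```

A possible use, again under the assumptions above: for two channels `left` and `right` sampled at 16 kHz from microphones 0.2 m apart, `tau = gcc_phat(left, right, 16000, max_tau=0.2 / 343.0)` gives the delay and `delay_to_doa(tau, 0.2)` the corresponding angle. The phase transform discards magnitude information, which sharpens the correlation peak but also weights noisy frequency bins as heavily as speech-dominated ones; that trade-off is the kind of robustness issue the modified transformation in this contribution is concerned with.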
[1] Ning Ma, et al. Integrating pitch and localisation cues at a speech fragment level, 2007, INTERSPEECH.
[2] Michael Wohlmayr, et al. Joint position-pitch extraction from multichannel audio, 2007, INTERSPEECH.
[3] Harald Romsdorfer, et al. Comparison of SRP-PHAT and Multiband-PoPi Algorithms for Speaker Localization Using Particle Filters, 2010.
[4] M. Kepesi, et al. Joint Position-Pitch Estimation for Multiple Speaker Scenarios, 2008, 2008 Hands-Free Speech Communication and Microphone Arrays.