Two-microphone source separation algorithm based on statistical modeling of angle distributions

In this paper we present a novel two-microphone sound source separation algorithm that selects speech from the target speaker while suppressing signals from interfering sources. In this algorithm, which is referred to as SMAD-CW, we first estimate the direction of the sound source for each time-frequency bin using phase differences in the spectral domain. For each frame, we assume that the angle distribution is a mixture of two distributions, one from the target and the other from the dominant noise source. Each mixture component is modeled by the von Mises distribution, which is a close approximation to the wrapped normal distribution. The expectation-maximization (EM) algorithm is employed to obtain the parameters of this mixture distribution. Using this statistical model, we perform maximum a posteriori (MAP) hypothesis testing to obtain appropriate binary masks. We demonstrate that the algorithm described in this paper provides speech recognition accuracy that is significantly better than that obtained using conventional approaches.
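The processing chain summarized above (angle estimation, a two-component von Mises mixture fitted by EM, and MAP masking) can be illustrated with a minimal Python sketch for a single frame. This is not the SMAD-CW implementation. The function names, the far-field relation used to convert phase difference to angle, the microphone spacing and speed-of-sound parameters, the initial values, the closed-form approximation for the concentration parameter, and the convention that component 0 is the target near 0 rad are all assumptions made for illustration.

```python
import numpy as np

def phase_diff_to_angle(phase_diff, freq_hz, mic_distance_m, c=343.0):
    """Assumed far-field model: phase_diff = 2*pi*f*d*sin(theta)/c.
    freq_hz must be nonzero (skip the DC bin)."""
    arg = c * phase_diff / (2.0 * np.pi * freq_hz * mic_distance_m)
    return np.arcsin(np.clip(arg, -1.0, 1.0))

def von_mises_pdf(theta, mu, kappa):
    """Von Mises density; np.i0 is the modified Bessel function of order 0."""
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * np.i0(kappa))

def fit_two_von_mises(theta, mu_init=(0.0, np.pi / 2), n_iter=50):
    """EM for a two-component von Mises mixture.
    Index 0 is taken to be the target, index 1 the dominant noise source
    (an assumption of this sketch, encoded through mu_init)."""
    mu = np.array(mu_init, dtype=float)
    kappa = np.array([5.0, 5.0])      # arbitrary starting concentrations
    weight = np.array([0.5, 0.5])     # equal starting priors
    for _ in range(n_iter):
        # E-step: posterior probability of each component for every angle.
        joint = np.stack([weight[k] * von_mises_pdf(theta, mu[k], kappa[k])
                          for k in range(2)])
        resp = joint / np.maximum(joint.sum(axis=0), 1e-300)
        # M-step: weighted circular mean and mean resultant length.
        n_k = resp.sum(axis=1)
        c_sum = (resp * np.cos(theta)).sum(axis=1)
        s_sum = (resp * np.sin(theta)).sum(axis=1)
        mu = np.arctan2(s_sum, c_sum)
        r_bar = np.clip(np.hypot(c_sum, s_sum) / np.maximum(n_k, 1e-12),
                        0.0, 0.999)
        # Closed-form approximation to the maximum-likelihood kappa
        # (assumed here; an exact solution would invert I1(k)/I0(k)).
        kappa = np.clip(r_bar * (2.0 - r_bar**2) / (1.0 - r_bar**2),
                        1e-3, 100.0)
        weight = n_k / theta.size
    return mu, kappa, weight

def map_binary_mask(theta, mu, kappa, weight):
    """MAP decision per bin: True where the target hypothesis is more probable."""
    target = weight[0] * von_mises_pdf(theta, mu[0], kappa[0])
    noise = weight[1] * von_mises_pdf(theta, mu[1], kappa[1])
    return target > noise

# Synthetic single-frame example: target near 0 rad, interferer near 1 rad.
rng = np.random.default_rng(0)
theta = np.concatenate([rng.vonmises(0.0, 20.0, 2000),
                        rng.vonmises(1.0, 20.0, 1000)])
mu, kappa, weight = fit_two_von_mises(theta)
mask = map_binary_mask(theta, mu, kappa, weight)
```

In this sketch the mask is computed from the same angles used to fit the mixture, mirroring the per-frame modeling described above. In practice the angles would come from `phase_diff_to_angle` applied to the inter-microphone phase difference of each time-frequency bin, and the resulting mask would be applied to the spectrum before resynthesis.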