Paper information - Speaker Localization and Speech Separation in Two Echoic Mixtures

Speaker Localization and Speech Separation in Two Echoic Mixtures

We develop two key improvements to the time-frequency masking approach to blind speech separation of underdetermined mixtures, covering both anechoic and echoic mixtures. First, the proposed method copes with the typically large delay estimation error that appears in the low-frequency band. This step generates a restrictive mask for phase delays on the basis of local and global energy distribution analysis, so that only the selected time-frequency cells contribute to the orientation histogram. Second, the strong W-disjoint orthogonality (WDO) assumption is relaxed by allowing some frequency bins to be shared by both sources. The fundamental frequencies of the speakers are detected at each time instant, and their harmonic frequencies are used to support mask creation. Experiments with both simulated and real-life mixtures show the proposed method to be effective and reliable.
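As a rough illustration of the first step, the sketch below builds an energy-weighted delay histogram from two microphone signals, using only time-frequency cells that pass a restrictive mask. It is a minimal sketch under stated assumptions, not the authors' algorithm: the mask here is a plain global energy quantile combined with a phase-ambiguity limit, standing in for the local and global energy distribution analysis described in the abstract. The function names `stft` and `delay_histogram` and all parameters (`n_fft`, `hop`, `max_delay`, `n_hist`, `energy_quantile`) are hypothetical.

```python
import numpy as np


def stft(x, n_fft=256, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, bins)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def delay_histogram(x1, x2, n_fft=256, hop=128, max_delay=4.0,
                    n_hist=33, energy_quantile=0.8):
    """Energy-weighted histogram of inter-channel delays, in samples.

    Assumption: the mask is a simple global energy threshold plus a
    phase-ambiguity limit; it only stands in for the paper's analysis.
    """
    X1, X2 = stft(x1, n_fft, hop), stft(x2, n_fft, hop)

    # Angular frequency of each bin in rad/sample; skip DC, where the
    # delay is undefined (division by zero).
    omega = 2.0 * np.pi * np.arange(1, X1.shape[1]) / n_fft
    X1, X2 = X1[:, 1:], X2[:, 1:]

    # If x2[n] = x1[n - d], then X2 = X1 * exp(-1j * omega * d),
    # so d = -angle(X2 * conj(X1)) / omega.
    delay = -np.angle(X2 * np.conj(X1)) / omega

    # Restrictive mask: keep high-energy cells whose phase cannot wrap
    # for any delay up to max_delay.
    energy = np.abs(X1) ** 2 + np.abs(X2) ** 2
    unambiguous = omega * max_delay < np.pi
    keep = (energy >= np.quantile(energy, energy_quantile)) & unambiguous[None, :]

    hist, edges = np.histogram(delay[keep], bins=n_hist,
                               range=(-max_delay, max_delay),
                               weights=energy[keep])
    return hist, edges
```

Peaks of the returned histogram indicate candidate source delays, which map to directions of arrival once the microphone spacing and sampling rate are known. Cells can then be assigned to the nearest peak to form a separation mask.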

Ning Ding | Nozomu Hamada | Włodzimierz Kasprzak
