Subband Based Blind Source Separation with Appropriate Processing for Each Frequency Band

We propose subband-based blind source separation (BSS) for convolutive mixtures of speech. This is motivated by a drawback of frequency-domain BSS: when a long frame with a fixed frame-shift is used for a few seconds of speech, the number of samples in each frequency bin decreases and the separation performance is degraded. In our proposed subband BSS, (1) by using a moderate number of subbands, a sufficient number of samples can be held in each subband, and (2) by using FIR filters in each subband, we can handle long reverberation. Subband BSS achieves better performance than frequency-domain BSS. Moreover, subband BSS allows us to select the separation method suited to each subband. Using this advantage, we propose efficient separation procedures that take the frequency characteristics of room reverberation and speech signals into consideration, (3) by using longer unmixing filters in low frequency bands, and (4) by adopting overlap-blockshift in BSS's batch adaptation in low frequency bands. Consequently, subband processing appropriate for each frequency bin is successfully realized with the proposed subband BSS.
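
The sample-count argument can be illustrated with a small arithmetic sketch. Everything here is an assumption chosen for illustration, not a value from the paper: the 8 kHz sampling rate, the 3-second duration, the frame lengths, the decimation factor of 16, and the reading of "fixed frame-shift" as a shift of half the frame length. The helper names `samples_per_bin` and `samples_per_subband` are hypothetical.

```python
def samples_per_bin(num_samples, frame_len, frame_shift):
    """Count the frames that fit in the signal.

    Each frame contributes one value per frequency bin, so this is also
    the number of samples available in each bin for learning.
    """
    if num_samples < frame_len:
        return 0
    return 1 + (num_samples - frame_len) // frame_shift


def samples_per_subband(num_samples, decimation):
    """Count the samples left in one subband after decimation."""
    return num_samples // decimation


# Assumed setup: 3 s of speech at 8 kHz (illustrative values only).
fs = 8000
duration_s = 3
n = fs * duration_s

# Assumption: the frame shift is half the frame length, so a longer
# frame (needed to cover long reverberation) means fewer frames.
for frame_len in (512, 2048):
    shift = frame_len // 2
    count = samples_per_bin(n, frame_len, shift)
    print(f"frame length {frame_len}: {count} samples per bin")

# A moderate decimation factor (assumed: 16) keeps far more samples
# in each subband than the long-frame case keeps in each bin.
print(f"subband, decimation 16: {samples_per_subband(n, 16)} samples")
```

Under these assumptions, quadrupling the frame length cuts the data available per frequency bin to roughly a quarter, whereas each subband keeps a much larger number of samples, and the FIR filters in each subband are what cover the reverberation.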
