Time domain blind source separation of non-stationary convolved signals by utilizing geometric beamforming

We propose a time-domain blind source separation (BSS) algorithm that uses geometric information, namely the sensor positions and the assumed source locations. The algorithm handles convolutive mixtures by explicitly exploiting the non-stationarity of the acoustic sources. Its learning rule is based on second-order statistics and is derived by natural gradient minimization. The algorithm is initialized according to the null beamforming principle. This geometric initialization improves separation performance and allows long unmixing FIR filters to be estimated in the time domain. We also propose a dewhitening post-filter based on the scaling technique used in frequency-domain BSS. Computer simulations demonstrate the validity of the proposed method. Experimental results confirm that the algorithm separates real-world speech mixtures and can be trained on data sets as short as a few seconds. The results also confirm that the dewhitening post-filter preserves the spectral content of the original speech in the separated output.
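To make the null beamforming initialization concrete, the following is a minimal sketch, not the authors' procedure. It assumes a far-field plane-wave model, a linear array, and a frequency-domain design that is converted to FIR filters by an inverse FFT. The function names `steering_matrix` and `null_beamformer_fir`, the speed of sound `c`, and all parameter values are illustrative assumptions that do not come from the abstract.

```python
import numpy as np

def steering_matrix(freqs, sensor_pos, doas_deg, c=343.0):
    """Far-field steering matrix A[f, m, k] for a linear array (assumed model)."""
    freqs = np.asarray(freqs, dtype=float)
    pos = np.asarray(sensor_pos, dtype=float)
    sin_doa = np.sin(np.deg2rad(np.asarray(doas_deg, dtype=float)))
    # Propagation delay of source k at sensor m, relative to the array origin.
    delays = pos[:, None] * sin_doa[None, :] / c            # shape (M, K)
    return np.exp(-2j * np.pi * freqs[:, None, None] * delays[None, :, :])

def null_beamformer_fir(sensor_pos, doas_deg, fs, n_taps, c=343.0):
    """Return FIR filters h[k, m, :] whose output k nulls all assumed sources but k.

    Assumes n_taps is even. This is an illustrative initialization only.
    """
    freqs = np.fft.rfftfreq(n_taps, d=1.0 / fs)             # bins 0 .. fs/2
    A = steering_matrix(freqs, sensor_pos, doas_deg, c)     # shape (F, M, K)
    # Per-bin inverse of the assumed mixing model. The pseudo-inverse is used
    # because the matrix is singular at DC, where all steering phases are zero.
    W = np.linalg.pinv(A)                                   # shape (F, K, M)
    h = np.fft.irfft(W, n=n_taps, axis=0)                   # shape (n_taps, K, M)
    # Circular shift by half the filter length so the filters are causal.
    h = np.roll(h, n_taps // 2, axis=0)
    return np.moveaxis(h, 0, -1)                            # shape (K, M, n_taps)

# Hypothetical setup: two microphones 4 cm apart, sources assumed at -30 and 40 degrees.
h_init = null_beamformer_fir([0.0, 0.04], [-30.0, 40.0], fs=8000, n_taps=256)
```

Here `h_init[k, m]` is the filter from sensor `m` to output `k`. In a scheme like the one described above, filters of this kind would serve only as the starting point, and the adaptive learning rule would then refine them on the recorded mixtures.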
