Paper Information - Blind Separation of More Speech than Sensors with Less Distortion by Combining Sparseness and ICA

Blind Separation of More Speech than Sensors with Less Distortion by Combining Sparseness and ICA

We propose a method for separating speech signals with little distortion when the signals outnumber the sensors. Several methods have already been proposed for solving this underdetermined problem, and some of them exploit the sparseness of speech signals. These methods employ binary masks that extract a signal at the time points where only one source is estimated to be active. However, the masks introduce an excess of zero-padding, so the extracted speech is severely distorted and contains loud musical noise. In this paper, we propose combining a sparseness approach with independent component analysis (ICA). First, using sparseness, we estimate the time points at which only one source is active. Then, we remove this single source from the observations and apply ICA to the remaining mixtures. Experimental results show that our proposed sparseness and ICA (SPICA) method can separate signals with little distortion even under reverberant conditions.
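The sparseness step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two sensors, a single frequency bin, an idealized anechoic model in which each source is identified by a known inter-sensor phase difference, and exactly one active source per frame. The function names, the `tol` parameter, and the synthetic data are all hypothetical. Points attributed to the target source are zeroed out, so the remaining observations contain at most two sources and could be passed to a standard two-by-two ICA.

```python
import numpy as np

def single_source_mask(x1, x2, target_phase, tol=0.2):
    """Flag points whose inter-sensor phase difference is near target_phase.

    Assumption: a source is identified by its phase difference between
    the two sensors (hypothetical criterion for this sketch).
    """
    phase_diff = np.angle(x2 * np.conj(x1))
    # Wrap the distance to the target into (-pi, pi] before thresholding.
    dist = np.angle(np.exp(1j * (phase_diff - target_phase)))
    return np.abs(dist) < tol

def remove_source(x1, x2, target_phase, tol=0.2):
    """Zero out the points attributed to the target source."""
    mask = single_source_mask(x1, x2, target_phase, tol)
    keep = ~mask
    return x1 * keep, x2 * keep, mask

# Synthetic data: three sources, two sensors, one active source per frame.
rng = np.random.default_rng(0)
phases = np.array([-1.0, 0.0, 1.0])      # assumed phase difference per source
n_frames = 200
active = rng.integers(0, 3, size=n_frames)
s = rng.normal(size=n_frames) + 1j * rng.normal(size=n_frames)
x1 = s                                    # observation at sensor 1
x2 = s * np.exp(1j * phases[active])      # observation at sensor 2

# Remove the third source; only the first two remain in the observations.
x1_rest, x2_rest, mask = remove_source(x1, x2, target_phase=phases[2])
```

In this idealized setting the mask coincides with the frames where the third source is active. Real recordings are reverberant and only approximately sparse, so the threshold and the phase model would both need care.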

Hiroshi Sawada | Shoko Araki | Shoji Makino | Ryo Mukai | Audrey Blin

[1] Shoko Araki, et al. Blind Source Separation when Speech Signals Outnumber Sensors using a Sparseness - Mixing Matrix Estimation (SMME), 2003.

[2] Özgür Yilmaz, et al. On the approximate W-disjoint orthogonality of speech, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002.

[3] Hiroshi Sawada, et al. Polar coordinate based nonlinear function for frequency-domain blind source separation, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002.

[4] Yutaka Kaneda, et al. Sound source segregation based on estimating incident angle of each frequency component of input signals acquired by multiple microphones, 2001.

[5] K. Matsuoka, et al. A Robust Algorithm for Blind Separation of Convolutive Mixture of Sources, 2003.

[6] Shoko Araki, et al. The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech, IEEE Trans. Speech Audio Process., 2003.

[7] Deniz Erdogmus, et al. Underdetermined blind source separation in a time-varying environment, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002.

[8] Hiroshi Sawada, et al. A robust and precise method for solving the permutation problem of frequency-domain blind source separation, IEEE Transactions on Speech and Audio Processing, 2004.