Comparison of superimposition and sparse models in blind source separation by multichannel Wiener filter

The multichannel Wiener filter proposed by Duong et al. can perform underdetermined blind source separation (BSS) with low distortion. This method assumes that the observed signal is the superimposition of multichannel source images generated from multivariate normal distributions. The covariance matrix in each time-frequency slot is estimated by an EM algorithm that treats the source images as hidden variables. Using the estimated parameters, the source images are separated as the maximum a posteriori estimate. It is worth noting that this method does not assume sparseness of the sources, which is usually assumed in underdetermined BSS. In this paper we investigate the effectiveness of three attributes of Duong's method: the source image model with a multivariate normal distribution, the observation model without the sparseness assumption, and source separation by the multichannel Wiener filter. We newly formulate three BSS methods that use a similar source image model but a different observation model that assumes sparseness, and we compare them with Duong's method and with conventional binary masking. Experimental results confirmed the effectiveness of all three attributes of Duong's method.
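As a rough illustration of the separation step described above, the following sketch applies a multichannel Wiener filter in a single time-frequency slot. It assumes two channels, three sources, and covariance matrices that are already known (in the method itself they would come from the EM algorithm). Under the zero-mean multivariate normal model with the observation equal to the sum of the source images, the posterior mean, which coincides with the maximum a posteriori estimate, is the source covariance times the inverse of the summed covariance times the observation. The function names and all numerical values are hypothetical and are not taken from the paper.

```python
# Sketch of multichannel Wiener filtering in ONE time-frequency slot.
# Assumptions (not from the paper): 2 channels, 3 sources, and hand-picked
# covariance matrices standing in for the EM-estimated ones.

def mat_add(a, b):
    """Element-wise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_inv2(m):
    """Inverse of a 2x2 matrix via the closed-form adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]

def scaled_cov(variance, spatial):
    """Source image covariance: scalar variance times a spatial covariance."""
    return [[variance * spatial[i][j] for j in range(2)] for i in range(2)]

def wiener_separate(x, covs):
    """Estimate each source image as Sigma_j * inv(sum_k Sigma_k) * x."""
    sigma_x = [[0j, 0j], [0j, 0j]]
    for cov in covs:
        sigma_x = mat_add(sigma_x, cov)
    inv_sigma_x = mat_inv2(sigma_x)
    return [mat_vec(mat_mul(cov, inv_sigma_x), x) for cov in covs]

# Three sources observed on two channels (an underdetermined case).
# Each spatial covariance is Hermitian and positive definite.
covs = [
    scaled_cov(2.0, [[1.0, 0.3 + 0.2j], [0.3 - 0.2j, 1.0]]),
    scaled_cov(0.5, [[1.0, -0.4j], [0.4j, 1.0]]),
    scaled_cov(1.0, [[1.0, -0.2 + 0.1j], [-0.2 - 0.1j, 1.0]]),
]
x = [1.0 + 0.5j, -0.3 + 0.8j]  # observed two-channel STFT coefficient

images = wiener_separate(x, covs)
for j, image in enumerate(images):
    print(f"source image {j}: {image}")
```

Because the per-source filters sum to the identity matrix, the estimated source images add up to the observation exactly. This reflects the superimposition model: no sparseness assumption is needed, in contrast to binary masking, which assigns each time-frequency slot to a single source.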

[1] Hiroshi Sawada, et al. Blind source separation of mixed speech in a high reverberation environment, 2011, 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays.

[2] Hiroshi Sawada, et al. Underdetermined Convolutive Blind Source Separation via Frequency Bin-Wise Clustering and Permutation Alignment, 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[3] 戸上 真人. Statistical estimation theory considering time-varying nature of systems and source-probability distributions, 2011.

[4] Rémi Gribonval, et al. Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model, 2009, IEEE Transactions on Audio, Speech, and Language Processing.

[5] Seungjin Choi, et al. Independent Component Analysis, 2009, Handbook of Natural Computing.

[6] Shigeki Sagayama, et al. Sparseness-Based 2CH BSS using the EM Algorithm in Reverberant Environment, 2007, 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.

[7] Emmanuel Vincent, et al. First Stereo Audio Source Separation Evaluation Campaign: Data, Algorithms and Results, 2007, ICA.

[8] Hiroshi Sawada, et al. Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors, 2007, Signal Process.

[9] Hiroshi Sawada, et al. Blind Speech Separation by Combining Beamformers and a Time Frequency Binary Mask, 2006.

[10] Scott Rickard, et al. Blind separation of speech mixtures via time-frequency masking, 2004, IEEE Transactions on Signal Processing.

[11] O. L. Frost, et al. An algorithm for linearly constrained adaptive array processing, 1972.