Speech separation based on semi-blind kurtosis maximization with magnitude and energy distance

Frequency-domain blind source separation (BSS) separates convolutively mixed speech efficiently by reducing time-domain convolutive mixtures to instantaneous mixtures of complex-valued signals at each frequency bin, but it suffers from permutation ambiguity. Because the semi-blind complex kurtosis maximization (KM) algorithm can separate complex-valued signals in a fixed order by incorporating magnitude priors about the sources as references, we apply it here to speech separation in the frequency domain. Once the reference is determined, the closeness measure between the BSS estimate and the reference is vital for the semi-blind KM algorithm to extract a specific source, so we examine two closeness measures in this study. One, originally used by the semi-blind KM algorithm, is based on the magnitude of the reference; the other is based on the energy of the reference. We define a distance between the source of interest and the other sources in terms of the closeness measure, and compare the two closeness measures with respect to both this distance for frequency-domain speech signals and speech separation performance. The results demonstrate that the distance under the new, energy-based closeness measure is larger than that under the original one, owing to energy matching between the estimate and the reference, and that the semi-blind KM algorithm using the new closeness measure achieves better frequency-domain speech separation performance.
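The quantities named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the closeness measures are taken to be mean squared differences between the estimate and the reference (in magnitude, or in energy), and the distance is taken to be the gap between the closest non-target source and the target. The function names are hypothetical, and the kurtosis expression is the commonly used definition for zero-mean complex signals rather than one confirmed by this text.

```python
import numpy as np


def complex_kurtosis(y):
    """Kurtosis of a zero-mean complex signal:
    E|y|^4 - 2 (E|y|^2)^2 - |E[y^2]|^2 (a common definition, assumed here)."""
    y = np.asarray(y, dtype=complex)
    power = np.mean(np.abs(y) ** 2)
    return np.mean(np.abs(y) ** 4) - 2.0 * power ** 2 - np.abs(np.mean(y ** 2)) ** 2


def magnitude_closeness(y, r):
    """ASSUMED form: mean squared difference between |y| and the reference magnitude r."""
    return np.mean((np.abs(y) - r) ** 2)


def energy_closeness(y, r):
    """ASSUMED form: mean squared difference between |y|^2 and the reference energy r^2."""
    return np.mean((np.abs(y) ** 2 - r ** 2) ** 2)


def distance(closeness, sources, r, target=0):
    """ASSUMED form: smallest closeness value among the non-target sources
    minus the closeness value of the target source."""
    values = [closeness(s, r) for s in sources]
    others = [v for i, v in enumerate(values) if i != target]
    return min(others) - values[target]


# Toy data (not speech): a unit-magnitude source and a scaled copy of it.
target_source = np.array([1, -1, 1j, -1j])
other_source = 2 * target_source
reference = 1.0  # magnitude prior matching the target source

d_magnitude = distance(magnitude_closeness, [target_source, other_source], reference)
d_energy = distance(energy_closeness, [target_source, other_source], reference)
```

A larger distance means the reference discriminates the source of interest from the others more clearly under the chosen closeness measure. The toy signals above only exercise the functions and say nothing about real frequency-domain speech.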
