Speech Separation via Parallel Factor Analysis of Cross-Frequency Covariance Tensor

This paper considers the separation of convolutive speech mixtures in the frequency domain within a tensorial framework. Under the assumption that components associated with neighboring frequency bins of the same source remain correlated, a set of cross-frequency covariance tensors with trilinear structure is established, and an algorithm consisting of consecutive parallel factor (PARAFAC) decompositions is developed. Each PARAFAC decomposition in the proposed method simultaneously estimates two neighboring frequency responses, one of which is a common factor shared with the subsequent cross-frequency covariance tensor and can therefore be used to align the permutations of the estimates across all the PARAFAC decompositions. In addition, the issue of identifiability is addressed, and simulations with synthetic speech signals are provided to verify the efficacy of the proposed method.
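The permutation alignment through a common factor can be illustrated with a minimal sketch. It is not the paper's implementation: the PARAFAC decompositions themselves are replaced by a hypothetical stand-in (`fake_parafac_pair`) that returns the two neighboring frequency responses with an unknown column permutation and unknown complex column scalings, which are the usual PARAFAC ambiguities. The sizes, the helper names, and the greedy column matching by absolute cosine similarity are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_sensors, n_sources = 5, 4, 3  # hypothetical problem sizes


def random_complex(*shape):
    """Draw a complex Gaussian array of the given shape."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


# Hypothetical ground-truth frequency responses, one mixing matrix per bin.
A_true = [random_complex(n_sensors, n_sources) for _ in range(n_bins)]


def fake_parafac_pair(k):
    """Stand-in for the k-th decomposition (assumption, not a real solver).

    Returns estimates of the responses at bins k and k+1 that share one
    unknown column permutation but have independent column scalings.
    """
    perm = rng.permutation(n_sources)
    first = A_true[k][:, perm] * random_complex(n_sources)
    second = A_true[k + 1][:, perm] * random_complex(n_sources)
    return first, second


def match_columns(ref, est):
    """Return perm such that est[:, perm[r]] matches ref[:, r] up to scale."""
    ref_n = ref / np.linalg.norm(ref, axis=0, keepdims=True)
    est_n = est / np.linalg.norm(est, axis=0, keepdims=True)
    corr = np.abs(ref_n.conj().T @ est_n)  # |cosine| between column pairs
    perm = np.empty(corr.shape[0], dtype=int)
    for _ in range(corr.shape[0]):
        # Greedily take the best remaining (reference, estimate) pair.
        r, c = np.unravel_index(np.argmax(corr), corr.shape)
        perm[r] = c
        corr[r, :] = -1.0
        corr[:, c] = -1.0
    return perm


# One decomposition per pair of neighboring bins: (0,1), (1,2), ...
pairs = [fake_parafac_pair(k) for k in range(n_bins - 1)]

# The first decomposition fixes the reference source order.
aligned = [pairs[0][0], pairs[0][1]]
for first, second in pairs[1:]:
    # `first` estimates the same bin as aligned[-1]: the common factor.
    perm = match_columns(aligned[-1], first)
    aligned.append(second[:, perm])

# Check: every bin should now have the same source order relative to truth.
orders = [tuple(match_columns(A_true[f], aligned[f])) for f in range(n_bins)]
print("consistent source order across bins:", len(set(orders)) == 1)
```

In this sketch the source order is propagated from bin to bin, so a mismatch at one bin would carry over to all later bins. With noisy estimates, a more robust assignment step than the greedy matching used here might be preferable.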
