Grouping Separated Frequency Components by Estimating Propagation Model Parameters in Frequency-Domain Blind Source Separation

This paper proposes a new formulation and optimization procedure for grouping frequency components in frequency-domain blind source separation (BSS). We adopt two separation techniques for frequency-domain BSS: independent component analysis (ICA) and time-frequency (T-F) masking. With ICA, grouping the frequency components corresponds to resolving the permutation ambiguity of the ICA solution in each frequency bin. With T-F masking, it corresponds to classifying the sensor observations in the time-frequency domain according to their source. The grouping procedure estimates the parameters of an anechoic propagation model by analyzing the ICA results or the sensor observations. More specifically, for each source it estimates the time delays of arrival and the attenuations from that source to all sensors. A particular focus of the paper is the applicability of the procedure to wide sensor spacing, where spatial aliasing may occur. Experimental results show that the procedure effectively separates two or three sources with several sensor configurations in a real room, as long as the room reverberation is moderately low.
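The idea of grouping frequency bins by fitting an anechoic model can be illustrated with a small sketch. It is not the paper's optimization procedure: it assumes two sources and two sensors, noise-free synthetic data, and a simple grid search over the delay, and all names (`fit_delay`, `align_two_sources`, `tau_grid`) are hypothetical. Each source is described by the ratio between its responses at the two sensors, modeled as `lam * exp(-2j*pi*f*tau)` with attenuation `lam` and delay `tau`. The delays are chosen large enough that the phase wraps within the band, which is the spatial-aliasing situation.

```python
import numpy as np

def fit_delay(freqs, ratios, tau_grid):
    """Fit one source: grid-search the delay, take the mean magnitude as attenuation."""
    unit = ratios / np.abs(ratios)
    # Score each candidate delay by how well it cancels the observed phase ramp.
    score = np.real(np.exp(2j * np.pi * np.outer(tau_grid, freqs)) @ unit)
    return tau_grid[np.argmax(score)], np.mean(np.abs(ratios))

def align_two_sources(freqs, ratios, tau_grid, n_iter=5):
    """ratios: (F, 2) complex sensor ratios whose column order is arbitrary per bin.
    Returns (swap, taus, lams); swap[f] is True where the columns of bin f are exchanged."""
    unit = ratios / np.abs(ratios)
    steer = np.exp(2j * np.pi * np.outer(tau_grid, freqs))
    # Initial guess (assumption of this sketch): the strongest delay peak when
    # both columns are pooled belongs to one of the sources.
    tau0 = tau_grid[np.argmax(np.real(steer @ unit.sum(axis=1)))]
    ref = np.exp(-2j * np.pi * freqs * tau0)
    swap = np.abs(unit[:, 1] - ref) < np.abs(unit[:, 0] - ref)

    def fit(current_swap):
        aligned = np.where(current_swap[:, None], ratios[:, ::-1], ratios)
        return [fit_delay(freqs, aligned[:, k], tau_grid) for k in range(2)]

    for _ in range(n_iter):
        # Alternate between fitting (tau, lam) per source and re-deciding each bin.
        model = np.stack([lam * np.exp(-2j * np.pi * freqs * tau)
                          for tau, lam in fit(swap)], axis=1)
        keep_cost = np.sum(np.abs(ratios - model) ** 2, axis=1)
        swap_cost = np.sum(np.abs(ratios[:, ::-1] - model) ** 2, axis=1)
        swap = swap_cost < keep_cost
    taus, lams = map(np.array, zip(*fit(swap)))
    return swap, taus, lams

# Synthetic check: 256 bins up to 4 kHz, delays that cause phase wrapping.
rng = np.random.default_rng(0)
freqs = np.arange(1, 257) * 8000 / 512
true_tau = np.array([0.4e-3, -0.25e-3])   # seconds
true_lam = np.array([0.8, 1.2])
clean = true_lam * np.exp(-2j * np.pi * freqs[:, None] * true_tau)
true_swap = rng.random(len(freqs)) < 0.5   # random per-bin permutation
scrambled = np.where(true_swap[:, None], clean[:, ::-1], clean)
tau_grid = np.linspace(-0.5e-3, 0.5e-3, 401)

swap, taus, lams = align_two_sources(freqs, scrambled, tau_grid)
print("estimated delays (ms):", taus * 1e3)
print("estimated attenuations:", lams)
```

The recovered ordering is consistent across bins only up to a global exchange of the two sources, which is the usual ambiguity in BSS. With real data, noise and reverberation would make the per-bin decisions less clear-cut than in this noise-free example.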
