Blind Source Separation Exploiting Higher-Order Frequency Dependencies

Blind source separation (BSS) is a challenging problem in real-world environments where sources are time delayed and convolved. The problem becomes more difficult in very reverberant conditions, with an increasing number of sources, and geometric configurations of the sources such that finding directionality is not sufficient for source separation. In this paper, we propose a new algorithm that exploits higher order frequency dependencies of source signals in order to separate them when they are mixed. In the frequency domain, this formulation assumes that dependencies exist between frequency bins instead of defining independence for each frequency bin. In this manner, we can avoid the well-known frequency permutation problem. To derive the learning algorithm, we define a cost function, which is an extension of mutual information between multivariate random variables. By introducing a source prior that models the inherent frequency dependencies, we obtain a simple form of a multivariate score function. In experiments, we generate simulated data with various kinds of sources in various environments. We evaluate the performances and compare it with other well-known algorithms. The results show the proposed algorithm outperforms the others in most cases. The algorithm is also able to accurately recover six sources with six microphones. In this case, we can obtain about 16-dB signal-to-interference ratio (SIR) improvement. Similar performance is observed in real conference room recordings with three human speakers reading sentences and one loudspeaker playing music

[1]  R. Stephens,et al.  Acoustics and Vibrational Physics , 1966 .

[2]  Jont B. Allen,et al.  Image method for efficiently simulating small‐room acoustics , 1976 .

[3]  William G. Gardner,et al.  The virtual acoustic room , 1992 .

[4]  Andrzej Cichocki,et al.  A New Learning Algorithm for Blind Signal Separation , 1995, NIPS.

[5]  Kari Torkkola,et al.  Blind separation of convolved sources based on information maximization , 1996, Neural Networks for Signal Processing VI. Proceedings of the 1996 IEEE Signal Processing Society Workshop.

[6]  R. Lambert Multichannel blind deconvolution: FIR matrix algebra and separation of multipath mixtures , 1996 .

[7]  Te-Won Lee,et al.  Blind Separation of Delayed and Convolved Sources , 1996, NIPS.

[8]  Ehud Weinstein,et al.  Multichannel signal separation: methods and analysis , 1996, IEEE Trans. Signal Process..

[9]  Te-Won Lee,et al.  Independent Component Analysis , 1998, Springer US.

[10]  Jean-François Cardoso,et al.  Multidimensional independent component analysis , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[11]  Stephan Weiss,et al.  On adaptive filtering in oversampled subbands , 1998 .

[12]  Paris Smaragdis,et al.  Blind separation of convolved mixtures in the frequency domain , 1998, Neurocomputing.

[13]  Kazuya Takeda,et al.  Evaluation of blind signal separation method using directivity pattern under reverberant conditions , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[14]  Birger Kollmeier,et al.  Amplitude Modulation Decorrelation For Convolutive Blind Source Separation , 2000 .

[15]  Terrence J. Sejnowski,et al.  ICA Mixture Models for Unsupervised Classification of Non-Gaussian Classes and Automatic Context Switching in Blind Signal Separation , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[16]  Lucas C. Parra,et al.  Convolutive blind separation of non-stationary sources , 2000, IEEE Trans. Speech Audio Process..

[17]  Aapo Hyvärinen,et al.  Emergence of Phase- and Shift-Invariant Features by Decomposition of Natural Images into Independent Feature Subspaces , 2000, Neural Computation.

[18]  B. Kollmeier,et al.  Convolutive blind source separation of speech signals based on amplitude modulation decorrelation , 2000 .

[19]  Andreas Ziehe,et al.  An approach to blind source separation based on temporal structure of speech signals , 2001, Neurocomputing.

[20]  Aapo Hyvärinen,et al.  Topographic Independent Component Analysis , 2001, Neural Computation.

[21]  Dennis R. Morgan,et al.  A beamforming approach to permutation alignment for multichannel frequency-domain blind speech separation , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[22]  Michael S. Lewicki,et al.  Unsupervised image classification, segmentation, and enhancement using ICA mixture models , 2002, IEEE Trans. Image Process..

[23]  K. Matsuoka,et al.  Minimal distortion principle for blind source separation , 2002, Proceedings of the 41st SICE Annual Conference. SICE 2002..

[24]  Nobuhiko Kitawaki,et al.  Combined approach of array processing and independent component analysis for blind separation of acoustic signals , 2003, IEEE Trans. Speech Audio Process..

[25]  William S. Rayens,et al.  Independent Component Analysis: Principles and Practice , 2003, Technometrics.

[26]  M. Lewicki,et al.  Learning higher-order structures in natural images , 2003, Network.

[27]  Hiroshi Sawada,et al.  A robust and precise method for solving the permutation problem of frequency-domain blind source separation , 2004, IEEE Transactions on Speech and Audio Processing.

[28]  Te-Won Lee,et al.  Modeling Nonlinear Dependencies in Natural Images using Mixture of Laplacian Distribution , 2004, NIPS.