Audio source separation based on independent component analysis

This paper introduces blind source separation (BSS) of convolutive mixtures of acoustic signals, especially speech. A statistical and computational technique called independent component analysis (ICA) is examined. By achieving nonlinear decorrelation, nonstationary decorrelation, or time-delayed decorrelation, the source signals can be recovered from the observed mixed signals alone. Particular attention is paid to the physical interpretation of BSS from the viewpoint of acoustical signal processing. Frequency-domain BSS is shown to be equivalent to two sets of frequency-domain adaptive microphone arrays, i.e., adaptive beamformers (ABFs). Although BSS can reduce reverberant sound to some extent in the same way as an ABF, it mainly removes the sound arriving from the jammer direction. This is why BSS has difficulty with long reverberation in real environments. If the sources are not "independent," the dependence results in bias noise when estimating the correct unmixing filter coefficients. The performance of BSS is therefore upper bounded by that of ABF. Nevertheless, BSS has a strong advantage over ABF: it can be regarded as an intelligent version of ABF in the sense that it adapts without any information on the array manifold or the target direction, and the sources can be simultaneously active.
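As an illustration of the time-delayed decorrelation criterion mentioned above, the following is a minimal sketch, not the method developed in the paper. It assumes an instantaneous (non-convolutive) two-channel mixture of toy sinusoidal sources with different autocorrelations; the function name `tdd_separate`, the mixing matrix, and the signal parameters are hypothetical choices made for this example. The observations are first whitened, and the symmetrised lag-one covariance of the whitened signals is then diagonalised.

```python
import numpy as np

def tdd_separate(x, lag=1):
    """Unmix x (channels x samples) by time-delayed decorrelation.

    Hypothetical helper for illustration; assumes an instantaneous mixture
    whose sources have distinct autocorrelations at the chosen lag.
    """
    x = x - x.mean(axis=1, keepdims=True)

    # Step 1: whiten so that the zero-lag covariance becomes the identity.
    c0 = x @ x.T / x.shape[1]
    d, e = np.linalg.eigh(c0)
    v = e @ np.diag(d ** -0.5) @ e.T
    z = v @ x

    # Step 2: diagonalise the symmetrised time-delayed covariance.
    c_tau = z[:, lag:] @ z[:, :-lag].T / (z.shape[1] - lag)
    c_tau = (c_tau + c_tau.T) / 2
    _, u = np.linalg.eigh(c_tau)

    w = u.T @ v          # estimated unmixing matrix
    return w @ x, w

# Toy sources (assumed): two sinusoids with different lag-1 autocorrelation.
t = np.arange(20000)
s = np.vstack([np.sin(2 * np.pi * 0.01 * t),
               np.sin(2 * np.pi * 0.2 * t)])

a = np.array([[1.0, 0.6],
              [0.5, 1.0]])   # assumed mixing matrix, unknown to the algorithm
x = a @ s

y, w = tdd_separate(x)

# Outputs match the sources only up to permutation, sign, and scale.
corr = np.abs(np.corrcoef(np.vstack([y, s]))[:2, 2:])
print(np.round(corr, 3))
```

The recovered signals are defined only up to order and scaling, which is the same ambiguity that gives rise to the permutation problem when a separation of this kind is carried out independently in each frequency bin.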
