A Maximum Likelihood Approach to Single-channel Source Separation

This paper presents a new technique for blind signal separation when only a single-channel recording is available. The main idea is to exploit a priori sets of time-domain basis functions, learned by independent component analysis (ICA), to separate mixed source signals observed in a single channel. The inherent time structure of sound sources is reflected in the ICA basis functions, which encode the sources in a statistically efficient manner. We derive a learning algorithm using a maximum likelihood approach, given the observed single-channel data and the sets of basis functions. For each time point, we infer the source parameters and their contribution factors. This inference is possible because of prior knowledge of the basis functions and the associated coefficient densities. A flexible model for density estimation allows accurate modeling of the observation, and our experimental results show a high level of separation performance for simulated mixtures as well as real-environment recordings of mixtures of two different sources.
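
The inference step described above can be illustrated with a toy sketch. It is not the algorithm derived in the paper. Everything specific in it is an assumption made for illustration: the mixing model y = lam1*x1 + lam2*x2 with fixed contribution factors, the two-sample segments, the orthonormal filter matrices W1 and W2, the exponential-power coefficient prior, and the helper names. A derivative-free coordinate search stands in for the gradient-based learning rule.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def source_loglik(W, x, q=1.0):
    """Log-likelihood (up to a constant) of segment x under an ICA model.

    Coefficients are s = W x, each with an assumed exponential-power prior
    p(s) proportional to exp(-|s|**q). W is assumed orthonormal, so the
    log|det W| term is zero and is omitted.
    """
    return -sum(abs(s) ** q for s in matvec(W, x))

def joint_loglik(x1, y, W1, W2, lam1, lam2):
    """Objective for a candidate x1; x2 is fixed by y = lam1*x1 + lam2*x2."""
    x2 = [(yi - lam1 * a) / lam2 for yi, a in zip(y, x1)]
    return source_loglik(W1, x1) + source_loglik(W2, x2), x2

def separate(y, W1, W2, lam1=0.5, lam2=0.5, step=0.5, tol=1e-6):
    """Coordinate search over x1 that only accepts moves raising the objective."""
    x1 = list(y)  # initial guess: attribute the observation to source 1
    best, x2 = joint_loglik(x1, y, W1, W2, lam1, lam2)
    while step > tol:
        improved = False
        for i in range(len(x1)):
            for delta in (step, -step):
                trial = list(x1)
                trial[i] += delta
                ll, trial_x2 = joint_loglik(trial, y, W1, W2, lam1, lam2)
                if ll > best:
                    x1, x2, best, improved = trial, trial_x2, ll, True
        if not improved:
            step /= 2.0  # refine the search once no move at this scale helps
    return x1, x2, best

# Hypothetical toy data: each source is sparse in its own basis.
c = 1.0 / math.sqrt(2.0)
W1 = [[1.0, 0.0], [0.0, 1.0]]   # source 1: identity basis
W2 = [[c, c], [-c, c]]          # source 2: basis rotated by 45 degrees
x1_true = [2.0, 0.0]
x2_true = [2.0 * c, 2.0 * c]
y = [0.5 * a + 0.5 * b for a, b in zip(x1_true, x2_true)]

x1_hat, x2_hat, ll_hat = separate(y, W1, W2)
```

By construction the estimates always reproduce the observation, and the search never lowers the objective. Because the prior is non-smooth, a coordinate search is not guaranteed to reach the global maximum, so the result should be read as an illustration of the objective and not as a separation benchmark.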
