Hybrid coding/indexing strategy for informed source separation of linear instantaneous under-determined audio mixtures

We present a system for under-determined source separation of non-stationary audio signals from a stereo (2-channel) linear instantaneous mixture. The system is designed to isolate the individual instruments and voices of a piece of music so that an end-user can manipulate those source signals separately. The problem is addressed with a specific informed approach, implemented with a coder corresponding to the music production step and a separate decoder corresponding to the signal restitution step. At the coder, the source signals are assumed to be available and are used to i) generate the stereo mix signal, and ii) extract a small set of distinctive features, which are embedded into the mix signal using an inaudible watermarking technique. At the decoder, extracting and exploiting the watermark from the transmitted mix signal enables an end-user who has no direct access to the original source signals to separate them from the mix signal. In the present study, we propose a new hybrid system that merges two informed source separation techniques: one subset of the source signals is encoded using a "sources-channel coding" approach, and another subset is selected for local inversion of the mixture. The respective codes and indexes are transmitted to the decoder using a new high-capacity watermarking technique. At the decoder, the encoded source signals are decoded and subtracted from the mixture signal; local inversion of the remaining sub-mixture signal then yields estimates of the second subset of source signals. This hybrid separation technique efficiently combines the advantages of the coding and inversion approaches. We report experiments in which 5 different source signals are separated from stereo mixtures with a quality high enough to enable separate manipulation during music restitution.
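The decoder-side procedure described above (subtract the decoded sources, then locally invert the remaining sub-mixture) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the mixing matrix `A` is known at the decoder, that the coded sources have already been decoded (the coding and watermark extraction stages are omitted), and that exactly two of the remaining sources are selected per time-frequency bin, so that the stereo sub-mixture is locally determined. The function name `hybrid_decode` and all variable names are hypothetical.

```python
import numpy as np

def hybrid_decode(X, A, coded_idx, S_coded, active_pairs):
    """Illustrative hybrid decoder (hypothetical names, not the paper's code).

    X            : (2, n_bins) stereo mixture coefficients
    A            : (2, n_src) mixing matrix, assumed known at the decoder
    coded_idx    : indices of the sources recovered by decoding
    S_coded      : (len(coded_idx), n_bins) decoded source coefficients
    active_pairs : (n_bins, 2) indices of the two sources inverted in each bin
    """
    n_src, n_bins = A.shape[1], X.shape[1]
    S_hat = np.zeros((n_src, n_bins))
    S_hat[coded_idx] = S_coded
    # Step 1: remove the contribution of the decoded sources from the mixture.
    residual = X - A[:, coded_idx] @ S_coded
    # Step 2: in each bin, invert the 2x2 sub-mixture of the selected sources.
    for t in range(n_bins):
        pair = active_pairs[t]
        S_hat[pair, t] = np.linalg.solve(A[:, pair], residual[:, t])
    return S_hat

# Synthetic check: 5 sources, 2 coded, at most 2 of the other 3 active per bin.
rng = np.random.default_rng(0)
angles = np.linspace(0.2, 1.4, 5)
A = np.vstack([np.cos(angles), np.sin(angles)])  # distinct mixing directions
coded_idx = [0, 1]
n_bins = 8
S = rng.standard_normal((5, n_bins))
pairs = np.array([[2, 3], [2, 4], [3, 4]])
active_pairs = pairs[rng.integers(0, 3, size=n_bins)]
for t in range(n_bins):
    # Silence the non-coded source that is not selected in this bin.
    inactive = [j for j in (2, 3, 4) if j not in active_pairs[t]][0]
    S[inactive, t] = 0.0
X = A @ S
S_hat = hybrid_decode(X, A, coded_idx, S[coded_idx], active_pairs)
```

In this idealized setting the decoded sources are exact and the unselected source is truly silent, so the estimate matches the original sources. With real signals, coding error and the energy of unselected sources would both leak into the inverted estimates.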
