Informed source separation: Underdetermined source signal recovery from an instantaneous stereo mixture

This paper presents a new technique for solving an ill-posed source separation problem encountered in stereo mixtures. The proposed method is realized in an encoder-decoder framework. On the encoder side, a set of spectral envelopes is extracted from the original tracks, which are known. These envelopes are passed to the decoder alongside the stereo mixture; their frequency resolution is adapted to the critical bands, and their magnitude is logarithmically quantized. On the decoder side, the mixture signal is decomposed by time-frequency selective, iterative spatial filtering guided by a source activity index, which is derived from the spectral envelope values. A comparison with a similar algorithm reveals that the novel approach yields a higher perceptual audio quality at a much lower data rate.
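
To make the encoder-side idea concrete, the following is a minimal sketch of a critical-band spectral envelope with logarithmic quantization for a single frame of one source track. It is an illustration under stated assumptions, not the paper's implementation: the Zwicker-Terhardt Bark approximation used to group bins, the 3 dB quantization step, the -120 dB floor, the naive DFT, and all function names are choices made here for the example only.

```python
import cmath
import math


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the critical-band rate in Bark."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for a short demo frame)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def band_envelope_db(power, sample_rate, step_db=3.0, floor_db=-120.0):
    """Average the power per Bark band and quantize the level on a dB grid.

    Returns a dict mapping band index to an integer quantization index.
    step_db and floor_db are assumed values, not taken from the paper.
    """
    n_fft = 2 * (len(power) - 1)
    bands = {}
    for k, p in enumerate(power):
        band = int(hz_to_bark(k * sample_rate / n_fft))  # integer Bark band of bin k
        bands.setdefault(band, []).append(p)

    envelope = {}
    for band, values in sorted(bands.items()):
        mean_power = sum(values) / len(values)
        level_db = 10.0 * math.log10(mean_power) if mean_power > 0 else floor_db
        level_db = max(level_db, floor_db)
        envelope[band] = round(level_db / step_db)  # logarithmic (dB) quantization
    return envelope


# Demo: a 1 kHz sine sampled at 16 kHz, one 256-sample frame.
sample_rate = 16000
frame = [math.sin(2.0 * math.pi * 1000.0 * t / sample_rate) for t in range(256)]
envelope = band_envelope_db(power_spectrum(frame), sample_rate)
loudest_band = max(envelope, key=envelope.get)
print(loudest_band, envelope[loudest_band])
```

In a scheme of this kind, only the small integer indices per band and frame would need to be transmitted as side information, which is why coarse, perceptually motivated resolution in both frequency and magnitude keeps the data rate low. How the decoder turns such envelopes into a source activity index is not specified in the abstract and is not sketched here.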
