Multichannel audio upmixing based on non-negative tensor factorization representation

This paper proposes a new spatial audio coding (SAC) method based on parametrizing multichannel audio by sound objects using non-negative tensor factorization (NTF). The spatial parameters are estimated using a perceptually motivated NTF model and are used to upmix a downmixed and encoded mixture signal. The performance of the proposed coding is evaluated using listening tests, which show it to be on par with conventional SAC methods. The novelty of the proposed coding is that it allows the upmix content to be controlled in terms of meaningful sound objects.
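As a rough illustration of the idea, the sketch below factorizes a multichannel magnitude spectrogram into per-object channel gains, spectral bases, and time activations, and then uses the model to redistribute a mono downmix across channels. It is a minimal sketch under stated assumptions, not the method of the paper: it uses a plain Euclidean cost with multiplicative updates rather than the perceptually motivated model, and the names `ntf`, `reconstruct`, `upmix`, `G`, `B`, and `A` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def reconstruct(G, B, A):
    """Model tensor: X_hat[c, f, t] = sum_k G[c, k] * B[f, k] * A[t, k]."""
    return np.einsum("ck,fk,tk->cft", G, B, A)


def ntf(X, n_objects, n_iter=200, seed=0):
    """Factorize a non-negative tensor X (channels x freqs x frames).

    Assumption: Euclidean cost with standard multiplicative updates,
    not the perceptually weighted cost described in the abstract.
    Returns channel gains G, spectral bases B, and activations A.
    """
    rng = np.random.default_rng(seed)
    C, F, T = X.shape
    G = rng.random((C, n_objects)) + EPS
    B = rng.random((F, n_objects)) + EPS
    A = rng.random((T, n_objects)) + EPS
    for _ in range(n_iter):
        # Each factor is updated in turn with the model recomputed in between.
        Xh = reconstruct(G, B, A)
        G *= np.einsum("cft,fk,tk->ck", X, B, A) / (
            np.einsum("cft,fk,tk->ck", Xh, B, A) + EPS)
        Xh = reconstruct(G, B, A)
        B *= np.einsum("cft,ck,tk->fk", X, G, A) / (
            np.einsum("cft,ck,tk->fk", Xh, G, A) + EPS)
        Xh = reconstruct(G, B, A)
        A *= np.einsum("cft,ck,fk->tk", X, G, B) / (
            np.einsum("cft,ck,fk->tk", Xh, G, B) + EPS)
    return G, B, A


def upmix(downmix, G, B, A):
    """Distribute a mono magnitude spectrogram (freqs x frames) over channels.

    Assumption: each channel receives the share of the downmix that the
    NTF model assigns to it in every time-frequency bin.
    """
    model = reconstruct(G, B, A)
    gains = model / (model.sum(axis=0, keepdims=True) + EPS)
    return gains * downmix[None, :, :]
```

In this sketch the columns of `G` play the role of spatial parameters, so zeroing or rescaling one column before calling `upmix` changes how a single object is placed in the output channels.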
