Paper information - Multichannel audio upmixing based on non-negative tensor factorization representation

This paper proposes a new spatial audio coding (SAC) method based on parametrizing multichannel audio by sound objects using non-negative tensor factorization (NTF). The spatial parameters are estimated using a perceptually motivated NTF model and are used for upmixing a downmixed and encoded mixture signal. The performance of the proposed coding is evaluated using listening tests, which show it to be on a par with conventional SAC methods. The novelty of the proposed coding is that it enables controlling the upmix content by meaningful objects.
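As a rough illustration of the kind of parametrization the abstract describes, the following minimal sketch factors a multichannel magnitude spectrogram tensor `X[c, f, t]` into per-object channel gains `G`, spectral bases `B`, and time activations `A`. It is an assumption-laden sketch, not the paper's method: it uses plain Euclidean-distance multiplicative updates rather than the perceptually motivated model, and the function names `ntf` and `upmix_masks`, the mask-based upmix, and the synthetic data are all hypothetical.

```python
import numpy as np

def ntf(X, n_objects, n_iter=200, eps=1e-12, seed=0):
    """Approximate X[c, f, t] by sum_k G[c, k] * B[f, k] * A[k, t].

    Plain Euclidean multiplicative updates (an assumption; the paper
    describes a perceptually motivated NTF model instead).
    """
    rng = np.random.default_rng(seed)
    C, F, T = X.shape
    G = rng.random((C, n_objects)) + eps   # channel gains per object
    B = rng.random((F, n_objects)) + eps   # spectral basis per object
    A = rng.random((n_objects, T)) + eps   # time activation per object
    for _ in range(n_iter):
        Xhat = np.einsum('ck,fk,kt->cft', G, B, A)
        G *= np.einsum('cft,fk,kt->ck', X, B, A) / (
            np.einsum('cft,fk,kt->ck', Xhat, B, A) + eps)
        Xhat = np.einsum('ck,fk,kt->cft', G, B, A)
        B *= np.einsum('cft,ck,kt->fk', X, G, A) / (
            np.einsum('cft,ck,kt->fk', Xhat, G, A) + eps)
        Xhat = np.einsum('ck,fk,kt->cft', G, B, A)
        A *= np.einsum('cft,ck,fk->kt', X, G, B) / (
            np.einsum('cft,ck,fk->kt', Xhat, G, B) + eps)
    return G, B, A

def upmix_masks(G, B, A, eps=1e-12):
    """Hypothetical upmix step: per-channel soft masks for a mono downmix."""
    Xhat = np.einsum('ck,fk,kt->cft', G, B, A)
    return Xhat / (Xhat.sum(axis=0, keepdims=True) + eps)

# Synthetic 5-channel example built from 3 known objects (illustrative data only).
rng = np.random.default_rng(1)
X = np.einsum('ck,fk,kt->cft',
              rng.random((5, 3)), rng.random((64, 3)), rng.random((3, 40)))

G, B, A = ntf(X, n_objects=3)
masks = upmix_masks(G, B, A)
downmix = X.sum(axis=0)            # mono downmix of the magnitudes
upmixed = masks * downmix[None]    # redistribute the downmix over channels
```

In this sketch, controlling the upmix by objects would amount to scaling or zeroing a column of `G` (or a row of `A`) before computing the masks.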

Tuomas Virtanen | Joonas Nikunen | Miikka Vilermo

[1] Thomas Sporer, et al. PEAQ - The ITU Standard for Objective Measurement of Perceived Audio Quality, 2000.

[2] Methods for the subjective assessment of small impairments in audio systems, 2015.

[3] Tuomas Virtanen, et al. Noise-to-mask ratio minimization by weighted non-negative matrix factorization, 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.

[4] Jürgen Herre, et al. MP3 Surround: Efficient and Compatible Coding of Multi-Channel Audio, 2004.

[5] D. Fitzgerald, et al. Non-negative Tensor Factorisation for Sound Source Separation, 2005.

[6] Heiko Purnhagen. Low Complexity Parametric Stereo Coding in MPEG-4, 2004.

[7] Tuomas Virtanen, et al. Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria, 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[8] Jeroen Breebaart, et al. Low Complexity Parametric Stereo Coding, 2004.

[9] Sugato Chakravarty, et al. Method for the subjective assessment of intermediate quality levels of coding systems, 2001.

[10] Ville Pulkki, et al. Spatial Sound Reproduction with Directional Audio Coding, 2007.

[11] Christof Faller, et al. Binaural cue coding - Part I: psychoacoustic fundamentals and design principles, 2003, IEEE Trans. Speech Audio Process.

[12] Emmanuel Vincent, et al. Low Bit-Rate Object Coding of Musical Audio Using Bayesian Harmonic Models, 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[13] Tuomas Virtanen, et al. Object-Based Audio Coding Using Non-Negative Matrix Factorization for the Spectrogram Representation, 2010.

[14] Alexey Ozerov, et al. Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation, 2010, IEEE Transactions on Audio, Speech, and Language Processing.